Method of sensor-assisted rate control

ABSTRACT

Methods and systems of determining a quantization step for encoding video based on motion data are provided. Video captured by an image capture device is received. The video comprises a video frame component. Additionally, motion data associated with the video frame component is received. Further, a quantization step for encoding the video frame component is determined based on the motion data.

CROSS-REFERENCE

This application is a continuation application of International Application No. PCT/CN2015/085759, filed Jul. 31, 2015, the contents of which are hereby incorporated by reference in their entirety.

BACKGROUND

Video that is captured, such as video that is captured by unmanned aerial vehicles (UAVs), may be encoded by various methods. However, video encoding methods and systems for UAVs may be less than ideal. For example, packet loss may occur when captured video from a UAV is encoded and transmitted, especially when the video contains a large amount of movement.

Aerial vehicles, such as UAVs, have been developed for a wide range of applications including surveillance, search and rescue operations, exploration, and other fields. Such UAVs may often carry a camera module on-board for video capturing. Video that is captured by UAVs may contain a large amount of movement.

SUMMARY

Maintenance of a constant bitrate (CBR) is an important aspect of modern video encoding technology. A CBR may be maintained when the number of bits that are fed to a decoder remains constant, e.g., within predetermined thresholds, over time. The maintenance of a CBR is important for transmitting data, such as video, over a network. In particular, when the bitrate of transmitted data fluctuates, packet loss and/or signal loss may result. The maintenance of a constant bitrate is also important when processing data, such as video, using a coded picture buffer (CPB) on the decoder side of a video encoding process. In particular, when the bitrate of data that is being processed fluctuates, the decoder buffer may overflow. As such, controlling bitrate when initially encoding data is an important consideration when using an encoding processor.

Accordingly, a need exists for improved methods and systems for encoding video obtained from video capture devices so as to maintain a CBR when the video data is decoded. The video capture devices may be carried by unmanned vehicles, such as unmanned aerial vehicles (UAVs). Methods are provided for encoding video captured by video capture devices, such as video capture devices on UAVs, by utilizing information from sensors associated with the UAV. In some embodiments, the video capture devices may capture video that includes motion data. Additionally, a UAV may use sensors that are associated with the UAV to capture information that may be used to generate an optical flow field. When the captured video is aligned with a correlating optical flow field that is based on sensor information captured at a similar time as the video, the resulting information may be used to efficiently encode the video data. In particular, the aligned video and optical flow field data may be used to efficiently allocate bits and/or choose quantization steps for encoding portions of a video frame component. Systems and methods described herein may be used to identify areas of video frames having a high degree of motion, and may allocate more bits and/or utilize a higher quantization step when encoding the portions of video frame components that are associated with a high degree of motion. For example, a higher quantization step may be used to encode a first video frame that is associated with a high degree of motion, and a lesser quantization step may be used to encode a second video frame that is associated with a degree of motion that is not a high degree of motion. A high degree of motion may be determined when the degree of motion within a video frame exceeds a threshold degree of motion, where the degree of motion may be assessed based on the degree of movement within a video frame. Additionally, the motion data that is associated with the video frame components may be determined based on an optical flow field that is associated with the video frame components. Accordingly, methods may be directed towards allocating bits and/or selecting quantization steps to encode video data based on information from an optical flow field. In particular, the optical flow field may be aligned with the video data so as to improve the efficiency of a video encoding process.
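A minimal sketch of this frame-level selection, assuming the motion data arrives as a dense optical flow field of per-pixel (dx, dy) vectors, is shown below in Python; the function name, threshold, and quantization step values are illustrative placeholders rather than values from the embodiments:

```python
import numpy as np

def select_quantization_step(flow_field, motion_threshold=4.0,
                             high_qstep=16, low_qstep=8):
    """Pick a quantization step for one video frame from its optical flow.

    flow_field: H x W x 2 array of per-pixel (dx, dy) motion vectors.
    A frame whose mean motion magnitude exceeds the threshold is treated
    as having a high degree of motion and is assigned the coarser
    (higher) quantization step; calmer frames get the finer step.
    """
    magnitudes = np.linalg.norm(flow_field, axis=2)
    degree_of_motion = magnitudes.mean()
    return high_qstep if degree_of_motion > motion_threshold else low_qstep
```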

An optical flow field that is generated using sensor data from a UAV may be used to efficiently encode video data that is aligned with the generated optical flow field. The video data may be encoded by one or more processors at the UAV, video capture device, or carrier on-board the UAV. The video data may be encoded by one or more processors external to the UAV, such as a user terminal that is communicatively connected to the UAV. Additionally, the optical flow field may be generated at the UAV. Alternatively, the optical flow field may be generated at an external location that is communicatively connected to the UAV. The sensor information that is used to generate the optical flow field may be detected at the UAV. Additionally or alternatively, the sensor information that is used to generate the optical flow field may be provided to the UAV from an external source that is communicatively connected to the UAV. Accordingly, video data that is captured by a video capture device may be efficiently encoded using an optical flow field that is generated based on sensor data that is associated with the UAV.

In particular, an optical flow field that corresponds to video data captured by a video capture device may be used to efficiently allocate bits and/or select quantization steps for encoding portions of video data. For example, when encoding video frames, the optical flow field data may be used to determine how many bits should be allocated to encode video data on a frame-by-frame basis. In examples where captured video has very little movement, as determined by an optical flow field associated with the video frame, an encoding processor may choose to allocate fewer bits to encoding the low movement video data on a frame-by-frame basis. Additionally, when portions of a video frame have little movement, as indicated by an optical flow field associated with the video frame, the video encoder may choose to allocate fewer bits to encode those low movement portions of the video frame.
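Block-level bit allocation can be sketched in the same spirit: given a frame bit budget, each block receives bits in proportion to the mean flow magnitude it covers. The block size, names, and the epsilon guard below are illustrative assumptions, not disclosed values:

```python
import numpy as np

def allocate_block_bits(flow_field, frame_bit_budget, block_size=16):
    """Split a frame's bit budget across blocks in proportion to motion.

    flow_field: H x W x 2 array of per-pixel (dx, dy) motion vectors.
    Returns a rows x cols array of per-block bit shares summing to
    frame_bit_budget; blocks covering faster-moving regions get more.
    """
    magnitudes = np.linalg.norm(flow_field, axis=2)
    rows = magnitudes.shape[0] // block_size
    cols = magnitudes.shape[1] // block_size
    block_motion = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            block_motion[r, c] = magnitudes[
                r * block_size:(r + 1) * block_size,
                c * block_size:(c + 1) * block_size].mean()
    # Epsilon keeps an all-static frame from dividing by zero.
    weights = block_motion + 1e-6
    return frame_bit_budget * weights / weights.sum()
```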

Further, when encoding video data, it is beneficial to break up video data into video frame components and encode recognized similarities between video frame components, rather than encoding each frame over and over again. This approach may be especially beneficial when video frame components, such as blocks, are similar or duplicates across a number of frames (e.g., when driving towards mountains that are far away, the mountains will look relatively the same across a number of video frame components). In particular, blocks that are similar or duplicates may be encoded based on the differences, or residue, between the blocks. This residue may require significantly fewer bits than re-encoding each similar or duplicate block.

However, as some video data may have a great deal of movement, it is sometimes difficult to associate blocks between video frames, even when there may be a great amount of similarity between at least some blocks of the two video frames. This is because, with great movement, similar elements within a video frame may be shifted across the video frame. For example, as a camera shifts right, objects of the video that were formerly at the right edge of a video frame will be shifted to the left. However, conventional methods of encoding video data are based on the assumption that blocks at a particular location on a first video frame are associated with blocks at the same particular location on a second video frame. In these examples, the optical flow field data may be used to reassess an algorithm that is used in balancing the rate-distortion optimization (RDO). In particular, the optical flow field data that is associated with the video data may be used by an encoding processor to focus more bit allocation on encoding coefficients between video frame components. Alternatively, the optical flow field data that is associated with the video data may be used by an encoding processor to focus more bit allocation on searching for motion vectors within video frame components.

Based on this shortcoming of conventional methods of encoding video data, aspects of the invention provide the use of optical flow field data to contextualize video data. In particular, an optical flow field that is aligned with the video data may be used by an encoding processor to allocate bits and/or select quantization steps for the encoding of video frame components.

An aspect of the invention may include a method of determining a quantization step for encoding video based on motion data. The method may include receiving video captured by an image capture device, the video comprising a video frame component. The method may also include receiving motion data associated with the video frame component. Additionally, the method may include determining a quantization step for encoding the video frame component based on the motion data.

In some embodiments, an aspect of the invention may include a non-transitory computer readable medium containing program instructions for determining a quantization step for encoding video based on motion data. The computer readable medium may include program instructions for receiving video captured by an image capture device, the video comprising a video frame component. Additionally, the computer readable medium may include program instructions for receiving motion data associated with the video frame component. The computer readable medium may also include program instructions for determining a quantization step for encoding the video frame component based on the motion data.

Aspects of the invention may further include a system for determining a quantization step for encoding video based on motion data. The system may include an image capture device configured to capture a video. The system may also include one or more processors, individually or collectively configured to receive the video captured by the image capture device, the video comprising a video frame component. The one or more processors may also be configured to receive motion data associated with the video frame component. Additionally, the one or more processors may be configured to determine a quantization step for encoding the video frame component based on the motion data.

In some other embodiments, aspects of the invention may include a method of determining a quantization step for encoding video based on motion data. The method may include receiving video captured by an image capture device, the video comprising a first video frame component and a second video frame component. Additionally, the method may include receiving motion data associated with the second video frame component. The method may also include determining a quantization step for encoding the first video frame component based on the motion data associated with the second video frame component.

Aspects of the invention may also include a non-transitory computer readable medium containing program instructions for determining a quantization step for encoding video based on motion data. The non-transitory computer readable medium may include program instructions for receiving video captured by an image capture device, the video comprising a first video frame component and a second video frame component. The non-transitory computer readable medium may also include program instructions for receiving motion data associated with the second video frame component. Additionally, the non-transitory computer readable medium may include program instructions for determining a quantization step for encoding the first video frame component based on the motion data associated with the second video frame component.

Further aspects of the invention may include a system for determining a quantization step for encoding video based on motion data. The system may include an image capture device configured to capture a video. The system may also include one or more processors, individually or collectively configured to receive video captured by an image capture device, the video comprising a first video frame component and a second video frame component. The one or more processors may also be configured to receive motion data associated with the second video frame component. Additionally, the one or more processors may be configured to determine a quantization step for encoding the first video frame component based on the motion data associated with the second video frame component.

Another aspect of the invention may include a method of bit allocation for encoding video based on motion data. The method may include receiving video captured by an image capture device, the video comprising a video frame component. Additionally, the method may include receiving motion data associated with the video frame component. The method may also include allocating bits associated with encoding the video frame component based on the motion data.

Additional aspects of the invention may include a non-transitory computer readable medium containing program instructions for bit allocation for encoding video based on motion data. The non-transitory computer readable medium may include program instructions for receiving video captured by an image capture device, the video comprising a video frame component. The non-transitory computer readable medium may also include program instructions for receiving motion data associated with the video frame component. Additionally, the non-transitory computer readable medium may include program instructions for allocating bits associated with encoding the video frame component based on the motion data.

Aspects of the invention may also include a system for bit allocation for encoding video based on motion data. The system may include an image capture device configured to capture a video. Additionally, the system may include one or more processors configured to receive video captured by an image capture device, the video comprising a video frame component. The one or more processors may also be configured to receive motion data associated with the video frame component. Additionally, the one or more processors may be configured to allocate bits associated with encoding the video frame component based on the motion data.

Further, additional aspects of the invention may include a method of bit allocation for encoding video based on motion data. The method may include receiving video captured by an image capture device, the video comprising a first video frame component and a second video frame component. The method may also include receiving motion data associated with the second video frame component. Additionally, the method may include allocating bits associated with encoding the first video frame component based on the motion data associated with the second video frame component.

Aspects of the invention may also include a non-transitory computer readable medium containing program instructions for bit allocation for encoding video based on motion data. The non-transitory computer readable medium may include program instructions for receiving video captured by an image capture device, the video comprising a first video frame component and a second video frame component. Additionally, the non-transitory computer readable medium may include program instructions for receiving motion data associated with the second video frame component. The non-transitory computer readable medium may also include program instructions for allocating bits associated with encoding the first video frame component based on the motion data associated with the second video frame component.

Additionally, aspects of the invention may include a system for bit allocation for encoding video based on motion data. The system may include an image capture device configured to capture a video. The system may also include one or more processors configured to receive video captured by an image capture device, the video comprising a first video frame component and a second video frame component. Additionally, the one or more processors may be configured to receive motion data associated with the second video frame component. The one or more processors may also be configured to allocate bits associated with encoding the first video frame component based on the motion data associated with the second video frame component.

It shall be understood that different aspects of the invention may be appreciated individually, collectively, or in combination with each other. Various aspects of the invention described herein may be applied to any of the particular applications set forth below or for any other types of movable objects. Any description herein of aerial vehicles, such as unmanned aerial vehicles, may apply to and be used for any movable object, such as any vehicle. Additionally, the systems, devices, and methods disclosed herein in the context of encoding video while a video capture device is capturing video data of aerial motion (e.g., flight) may also be applied in the context of encoding video while a video capture device is capturing video data of other types of motion, such as movement on the ground or on water, underwater motion, or motion in space.

Other objects and features of the present invention will become apparent by a review of the specification, claims, and appended figures.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

FIG. 1 shows a schematic view of an unmanned aerial vehicle (UAV) carrying a video capture device that is used to capture video, in accordance with embodiments of the invention.

FIG. 2 illustrates a general process of video encoding, in accordance with embodiments of the present invention.

FIG. 3 illustrates a process of determining video data compression based on movement within the video, in accordance with embodiments of the invention.

FIG. 4 illustrates schematics of bitrate and quantization step distributions between video frames having different motion components, in accordance with embodiments of the invention.

FIG. 5 illustrates an optical flow field that is associated with a rotating view from above for encoding a video frame, in accordance with embodiments of the invention.

FIG. 6 illustrates a global optical flow field having different degrees of object movement for encoding a video frame, in accordance with embodiments of the invention.

FIG. 7 illustrates an optical flow field that is associated with ultra-fast global camera motion for encoding a video frame, in accordance with embodiments of the invention.

FIG. 8 illustrates two video frame components, which are to be encoded, within an optical flow field that is associated with angled global motion, in accordance with embodiments of the invention.

FIG. 9 illustrates two video frame components, which are to be encoded, within an optical flow field that is associated with a zoom-in feature that is associated with a camera, in accordance with embodiments of the invention.

FIG. 10 illustrates two video frame components, which are to be encoded, within an optical flow field that is associated with a rotating view from above, in accordance with embodiments of the invention.

FIG. 11 illustrates three video frame components, which are to be encoded, within a global optical flow field having different degrees of object movement, in accordance with embodiments of the invention.

FIG. 12 illustrates examples of intra coding of pixels within a block in a video frame component, in accordance with embodiments of the invention.

FIG. 13 illustrates motion vectors linking co-located blocks across video frames, in accordance with embodiments of the invention.

FIG. 14 illustrates a structure of prioritizing calculation of a coefficient between frames rather than searching for a motion vector, in accordance with embodiments of the invention.

FIG. 15 is a flow chart illustrating a method of determining a quantization step for encoding video based on motion data, in accordance with embodiments of the invention.

FIG. 16 is a flow chart illustrating another method of determining a quantization step for encoding video based on motion data, in accordance with embodiments of the invention.

FIG. 17 is a flow chart illustrating a method of bit allocation for encoding video based on motion data, in accordance with embodiments of the invention.

FIG. 18 is a flow chart illustrating another method of bit allocation for encoding video based on motion data, in accordance with embodiments of the invention.

FIG. 19 illustrates an appearance of a UAV, in accordance with embodiments of the present invention.

FIG. 20 illustrates a movable object including a carrier and a payload, in accordance with embodiments of the present invention.

FIG. 21 is a schematic illustration by way of block diagram of a system for controlling a movable object, in accordance with embodiments of the present invention.

DETAILED DESCRIPTION

The methods, devices and terminals described herein provide effective approaches for efficiently encoding video captured by video capture devices, such as those carried by UAVs. The methods, devices and terminals described herein can be used to capture video data, generate an optical flow field based on sensor data associated with the UAV, and determine quantization steps and/or bit allocation for encoding the video data based on the generated optical flow field. The methods, devices and terminals disclosed herein can be applied to any suitable movable object or stationary object. A movable object may be capable of self-propelled movement (e.g., a vehicle), while a stationary object may not be capable of self-propelled movement. In some embodiments, the movable object may be an unmanned aerial vehicle (UAV).

In addition to providing methods that may be used to efficiently encode video data, methods are provided for encoding data so as to maintain a constant bitrate (CBR) when the video is decoded. In this way, video data that is encoded may be transmitted and processed in a way that provides the decoded video seamlessly to a user. Additionally, when video data is more efficiently encoded, a larger amount of video data may be recorded given a set amount of storage space. Alternatively, video of increased quality or duration may be recorded within the same amount of storage space that previously would only accommodate general video data. These aspects are beneficial in recording high-definition video, in recording video having a high degree of movement, and in providing video while maintaining a CBR.

Methods of the invention are able to efficiently encode video data, and maintain a CBR of decoded video, by efficiently allocating an amount of bits towards encoding video frame components. In particular, portions of video that have a high degree of movement may be encoded using more bits than portions of video that have less movement. Additionally, if there are not enough bits to allocate towards encoding portions of video, the compression of the video may be modified. In examples, an increased quantization step may be chosen when encoding portions of a video frame so as to compress the video and use fewer bits when encoding the video. This, in turn, helps to maintain the amount of bits that are allocated for encoding the video so as to maintain a constant bitrate. In particular, when the bitrate of data that is being processed fluctuates, the decoder buffer may overflow when decoding the video. As such, controlling bitrate when initially encoding data is an important consideration when using an encoding processor.

FIG. 1 shows a schematic view of an unmanned aerial vehicle (UAV) 100 carrying a video capture device 140 that is used to capture video, in accordance with embodiments of the invention. The UAV may have a UAV body 110 and one or more propulsion units 120 that may effect movement of the UAV. The UAV may have one or more sensors. The one or more sensors may be used to gather data that is used by the UAV to generate an optical flow field. The UAV may optionally have an on-board optical flow field generator 130. The optical flow field that is generated by the UAV may, in turn, be used to efficiently encode video that is captured by the UAV. An encoding processor 150 may optionally be carried by the UAV and used to encode the video.

Video may be captured using a video capture device 140. The video capture device may be supported on a stationary object or a movable object, such as a UAV. Any description herein of a UAV may include any support structure for the video capture device. Any description herein of a UAV 100 may apply to any type of movable object, such as an aerial vehicle. The description of a UAV may apply to any type of unmanned movable object (e.g., which may traverse the air, land, water, or space). The UAV may be capable of responding to commands from a remote controller. The remote controller need not be physically connected to the UAV; the remote controller may communicate with the UAV wirelessly from a distance. In some instances, the UAV may be capable of operating autonomously or semi-autonomously. The UAV may be capable of following a set of pre-programmed instructions. In some instances, the UAV may operate semi-autonomously by responding to one or more commands from a remote controller while otherwise operating autonomously. For instance, one or more commands from a remote controller may initiate a sequence of autonomous or semi-autonomous actions by the UAV in accordance with one or more parameters. In some embodiments, any description herein of a UAV may apply to any stationary object, such as a support for the video capture device (e.g., stand, pole, fence, building, wall, ceiling, roof, floor, ground, furniture, lighting fixture, tree, plant, stone, or any other stationary object).

The video capture device may be capable of altering a field of view (FOV) captured by the video capture device. The video capture device may have translational motion (e.g., side to side, front to back, up and down, or any combination thereof) to alter the video capture device FOV. The video capture device may have rotational movement (e.g., about a yaw, pitch, or roll axis of the video capture device) to alter the video capture device FOV. In some instances, the video capture device may only have translational motion without rotational motion, may only have rotational motion without translational motion, or may have both translational and rotational motion. Motion captured by video from the video capture device may be indicative of change of the video capture device FOV. The video encoding systems and methods may be used to encode the video captured by the video capture device, as described in greater detail elsewhere herein.

The video capture device may optionally be supported by a UAV 100 or any other support structure. The UAV may have a body 110. In some instances, the body may be a central body which may have one or more branching members, or “arms.” The arms may extend outward from the body in a radial manner and be joined via the body. The number of arms may match the number of propulsion units, or rotors, of the UAV. The body may comprise a housing. The housing may enclose one or more components of the UAV within the housing. In some instances, one or more electrical components of the UAV may be provided within the housing. For example, a flight controller of the UAV may be provided within the housing. The flight controller may control operation of one or more propulsion units 120 of the UAV. The propulsion units may each include the rotors and/or motors. Additionally, the one or more propulsion units may permit the UAV to move about in the air. The one or more propulsion units may be provided on an arm of the UAV. The arm may be connected to a body of the UAV on a proximal end of the arm. One or more propulsion units may be connected to a distal end of the arm. The one or more propulsion units may enable the UAV to move about one or more, two or more, three or more, four or more, five or more, six or more degrees of freedom. In some instances, the UAV may be able to rotate about one, two, three or more axes of rotation. The axes of rotation may be orthogonal to one another. The axes of rotation may remain orthogonal to one another throughout the course of the UAV's flight. The axes of rotation may include a pitch axis, roll axis, and/or yaw axis. The UAV may be able to move along one or more dimensions. For example, the UAV may be able to move upwards due to the lift generated by one or more rotors. In some instances, the UAV may be capable of moving along a Z axis (which may be up relative to the UAV orientation), an X axis, and/or a Y axis (which may be lateral). The UAV may be capable of moving along one, two, or three axes that may be orthogonal to one another.

The UAV may be a rotorcraft. In some instances, the UAV may be a multi-rotor craft that may include a plurality of rotors. The plurality of rotors may be capable of rotating to generate lift for the UAV. The rotors may be propulsion units that may enable the UAV to move about freely through the air. The rotors may rotate at the same rate and/or may generate the same amount of lift or thrust. The rotors may optionally rotate at varying rates, which may generate different amounts of lift or thrust and/or permit the UAV to rotate. In some instances, one, two, three, four, five, six, seven, eight, nine, ten, or more rotors may be provided on a UAV. The rotors may be arranged so that their axes of rotation are parallel to one another. In some instances, the rotors may have axes of rotation that are at any angle relative to one another, which may affect the motion of the UAV.

The UAV shown may have a plurality of rotors. The rotors may connect to the body of the UAV, which may comprise a control unit, one or more sensors, a processor, and a power source. The sensors may include vision sensors and/or other sensors that may collect information about the UAV environment. The information from the sensors may be used to determine a location of the UAV. The rotors may be connected to the body via one or more arms or extensions that may branch from a central portion of the body. For example, one or more arms may extend radially from a central body of the UAV, and may have rotors at or near the ends of the arms.

A vertical position and/or velocity of the UAV may be controlled by maintaining and/or adjusting output to one or more propulsion units of the UAV. For example, increasing the speed of rotation of one or more rotors of the UAV may aid in causing the UAV to increase in altitude or increase in altitude at a faster rate. Increasing the speed of rotation of the one or more rotors may increase the thrust of the rotors. Decreasing the speed of rotation of one or more rotors of the UAV may aid in causing the UAV to decrease in altitude or decrease in altitude at a faster rate. Decreasing the speed of rotation of the one or more rotors may decrease the thrust of the one or more rotors. When a UAV is taking off, the output provided to the propulsion units may be increased from its previous landed state. When the UAV is landing, the output provided to the propulsion units may be decreased from its previous flight state. The UAV may be configured to take off and/or land in a substantially vertical manner.

A lateral position and/or velocity of the UAV may be controlled by maintaining and/or adjusting output to one or more propulsion units of the UAV. The altitude of the UAV and the speed of rotation of one or more rotors of the UAV may affect the lateral movement of the UAV. For example, the UAV may be tilted in a particular direction to move in that direction, and the speed of the rotors of the UAV may affect the speed of the lateral movement and/or trajectory of movement. Lateral position and/or velocity of the UAV may be controlled by varying or maintaining the speed of rotation of one or more rotors of the UAV.

The arms of the UAV may be tubes or rods. The arms of the UAV may have a circular cross section. The arms of the UAV may have a square or rectangular cross section. The arms of the UAV may have an elliptic cross section. The arms of the UAV may be hollow tubes. The arms of the UAV may be solid tubes. The arms of the UAV may be formed from a metallic, plastic, or composite material. The arms of the UAV may be formed from a lightweight material. The arms of the UAV may be formed from carbon fiber. The arms of the UAV may be integrally formed with the central body of the UAV. Alternatively, the arms of the UAV may be separately formed or may be separable from the UAV.

The UAV may have a greatest dimension (e.g., length, width, height, diagonal, diameter) of no more than 100 cm. In some instances, the greatest dimension may be less than or equal to 1 mm, 5 mm, 1 cm, 3 cm, 5 cm, 10 cm, 12 cm, 15 cm, 20 cm, 25 cm, 30 cm, 35 cm, 40 cm, 45 cm, 50 cm, 55 cm, 60 cm, 65 cm, 70 cm, 75 cm, 80 cm, 85 cm, 90 cm, 95 cm, 100 cm, 110 cm, 120 cm, 130 cm, 140 cm, 150 cm, 160 cm, 170 cm, 180 cm, 190 cm, 200 cm, 220 cm, 250 cm, or 300 cm. Optionally, the greatest dimension of the UAV may be greater than or equal to any of the values described herein. The UAV may have a greatest dimension falling within a range between any two of the values described herein. The UAV may be a lightweight UAV. For example, the UAV may weigh less than or equal to 1 mg, 5 mg, 10 mg, 50 mg, 100 mg, 500 mg, 1 g, 2 g, 3 g, 5 g, 7 g, 10 g, 12 g, 15 g, 20 g, 25 g, 30 g, 35 g, 40 g, 45 g, 50 g, 60 g, 70 g, 80 g, 90 g, 100 g, 120 g, 150 g, 200 g, 250 g, 300 g, 350 g, 400 g, 450 g, 500 g, 600 g, 700 g, 800 g, 900 g, 1 kg, 1.1 kg, 1.2 kg, 1.3 kg, 1.4 kg, 1.5 kg, 1.7 kg, 2 kg, 2.2 kg, 2.5 kg, 3 kg, 3.5 kg, 4 kg, 4.5 kg, 5 kg, 5.5 kg, 6 kg, 6.5 kg, 7 kg, 7.5 kg, 8 kg, 8.5 kg, 9 kg, 9.5 kg, 10 kg, 11 kg, 12 kg, 13 kg, 14 kg, 15 kg, 17 kg, or 20 kg. The UAV may have a weight greater than or equal to any of the values described herein. The UAV may have a weight falling within a range between any two of the values described herein.

The UAV may carry the video capture device 140. The video capture device may be supported by any support structure, moving (e.g., UAV) or stationary. In some embodiments, the video capture device may be a camera. Any description herein of a camera may apply to any type of video capture device. The camera may be rigidly coupled to the support structure. Alternatively, the camera may be permitted to move relative to the support structure with respect to up to six degrees of freedom. The camera may be directly mounted onto the support structure, or coupled to a carrier mounted onto the support structure. In some embodiments, the carrier may be a gimbal. In some embodiments, the camera may be an element of a payload of the support structure, such as a UAV.

The camera may capture images (e.g., dynamic images such as video, or still images such as snapshots) of an environment of the UAV. The camera may continuously capture images (e.g., video). Alternatively, the camera may capture images (e.g., video) at a specified frequency to produce a series of image data (e.g., video data) over time. Any description herein of video may apply to any type of images, such as dynamic or still images, such as a series of images captured over time. Images may be captured at a video rate (e.g., 25, 50, 75, 100, 150, 200, or 250 Hz). In some embodiments, the video may be captured simultaneously with a recording of environment audio.

In some embodiments, the captured video may be stored in a memory on-board the UAV. The memory may be a non-transitory computer readable medium that may include one or more memory units (e.g., removable media or external storage such as a Secure Digital (SD) card, or a random access memory (RAM), or a read only memory (ROM) or a flash memory). Alternatively, the captured video and/or images may be transmitted to a remote terminal. The transmission of captured video and/or images may be implemented over a wireless link, including but not limited to, a radio frequency (RF) link, a Wi-Fi link, a Bluetooth link, a 2G link, a 3G link, or an LTE link. The memory may be on the camera carried by the UAV, on a carrier of the UAV, and/or on the UAV itself (e.g., within the UAV body or an arm of the UAV). The memory may or may not be removable or separable from the UAV, carrier, or camera.

The camera may comprise an image sensor and one or more lenses. The one or more lenses may be configured to direct light to the image sensor. An image sensor is a device that converts an optical image into an electronic signal. The image sensor of the camera may be a charge-coupled device (CCD) type, a complementary metal-oxide-semiconductor (CMOS) type, an N-type metal-oxide-semiconductor (NMOS) type, or a back-side illuminated CMOS (BSI-CMOS) type.

The camera may have a focal length or focal length range. A focal length of an optical system may be a measure of how strongly the system converges or diverges light. The focal length that is associated with the camera may influence a resulting optical flow field that is generated using video that is captured by the camera. The focal length of a lens may be the distance over which initially collimated rays are brought to a focus. The camera may have any type of lens, such as a prime lens or a zoom lens. A prime lens may have a fixed focal length and the focal length may encompass a single focal length. A zoom lens may have variable focal lengths and the focal length may encompass a plurality of focal lengths.

The video capture device may have a FOV that may change over time. The field of view (FOV) may be the part of the world that is visible through the camera at a particular position and orientation in space; objects outside the FOV when the picture is taken are not recorded in the video data. It is most often expressed as the angular size of the view cone, as an angle of view. For a normal lens, the field of view may be calculated as FOV = 2 arctan(d/2f), where d is the image sensor size and f is the focal length of the lens. For an image sensor having a fixed size, the prime lens may have a fixed FOV and the FOV may encompass a single FOV angle. For an image sensor having a fixed size, the zoom lens may have a variable FOV angular range and the FOV angular range may encompass a plurality of FOV angles. The size and/or location of the FOV may change. The FOV of the video capture device may be altered to increase or decrease the size of the FOV (e.g., zooming in or out), and/or to change a centerpoint of the FOV (e.g., moving the video capture device translationally and/or rotationally). Alteration of the FOV may result in motion within the video.
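The FOV = 2 arctan(d/2f) relation can be evaluated directly, as in the short sketch below; the sensor width and focal lengths are example values only:

```python
import math

def field_of_view_deg(sensor_size_mm, focal_length_mm):
    """Angle of view from the FOV = 2*arctan(d / 2f) relation above."""
    return math.degrees(2 * math.atan(sensor_size_mm / (2 * focal_length_mm)))

# A 36 mm wide sensor behind a 28 mm lens sees roughly 65.5 degrees
# horizontally; zooming to 56 mm narrows the FOV to roughly 35.6 degrees,
# which is why zoom state matters when predicting motion in the video.
print(field_of_view_deg(36.0, 28.0))  # ~65.5
print(field_of_view_deg(36.0, 56.0))  # ~35.6
```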

Data from sensors associated with a camera may be used to aid in generating an optical flow field, useful for encoding video data captured by the camera. The sensors associated with the camera may be on-board the camera, the support structure for the camera (e.g., UAV), and/or a carrier that supports the camera on the support structure (e.g., gimbal). Alternatively, the sensors associated with the camera may be remote from the camera, the carrier, and/or the support structure for the camera.

For instance, a support structure of the camera may support one or more sensors. In examples, the support structure may be a UAV. Any description of the sensors of the UAV may apply to any type of support structure for the camera. The UAV may comprise one or more vision sensors such as an image sensor. For example, an image sensor may be a monocular camera, stereo vision camera, radar, sonar, or an infrared camera. The UAV may further comprise other sensors that may be used to determine a location of the UAV, or may be useful for generating optical flow field information, such as global positioning system (GPS) sensors, inertial sensors which may be used as part of or separately from an inertial measurement unit (IMU) (e.g., accelerometers, gyroscopes, magnetometers), lidar, ultrasonic sensors, acoustic sensors, and WiFi sensors. The UAV may have sensors on-board the UAV that collect information directly from an environment without contacting an additional component off-board the UAV for additional information or processing. For example, a sensor that collects data directly in an environment may be a vision or audio sensor.

Alternatively, the UAV may have sensors that are on-board the UAV but contact one or more components off-board the UAV to collect data about an environment. For example, a sensor that contacts a component off-board the UAV to collect data about an environment may be a GPS sensor or another sensor that relies on connection to another device, such as a satellite, tower, router, server, or other external device. Various examples of sensors may include, but are not limited to, location sensors (e.g., global positioning system (GPS) sensors, mobile device transmitters enabling location triangulation), vision sensors (e.g., imaging devices capable of detecting visible, infrared, or ultraviolet light, such as cameras), proximity or range sensors (e.g., ultrasonic sensors, lidar, time-of-flight or depth cameras), inertial sensors (e.g., accelerometers, gyroscopes, inertial measurement units (IMUs)), altitude sensors, attitude sensors (e.g., compasses), pressure sensors (e.g., barometers), audio sensors (e.g., microphones) or field sensors (e.g., magnetometers, electromagnetic sensors). Any suitable number and combination of sensors may be used, such as one, two, three, four, five, or more sensors. Optionally, the data may be received from sensors of different types (e.g., two, three, four, five, or more types). Sensors of different types may measure different types of signals or information (e.g., position, orientation, velocity, acceleration, proximity, pressure, etc.) and/or utilize different types of measurement techniques to obtain data.

Any of these sensors may also be provided off-board the UAV. The sensors may be associated with the UAV. For instance, the sensors may detect characteristics of the UAV such as position of the UAV, speed of the UAV, acceleration of the UAV, orientation of the UAV, noise generated by the UAV, light emitted or reflected from the UAV, heat generated by the UAV, or any other characteristic of the UAV. The sensors may collect data that may be used alone or in combination with sensor data from sensors on-board the UAV to generate optical flow field information.

The sensors may include any suitable combination of active sensors (e.g., sensors that generate and measure energy from their own energy source) and passive sensors (e.g., sensors that detect available energy). As another example, some sensors may generate absolute measurement data that is provided in terms of a global coordinate system (e.g., position data provided by a GPS sensor, attitude data provided by a compass or magnetometer), while other sensors may generate relative measurement data that is provided in terms of a local coordinate system (e.g., relative angular velocity provided by a gyroscope; relative translational acceleration provided by an accelerometer; relative attitude information provided by a vision sensor; relative distance information provided by an ultrasonic sensor, lidar, or time-of-flight camera). The sensors on-board or off-board the UAV may collect information such as location of the UAV, location of other objects, orientation of the UAV 100, or environmental information. A single sensor may be able to collect a complete set of information in an environment, or a group of sensors may work together to collect a complete set of information in an environment. Sensors may be used for mapping of a location, navigation between locations, detection of obstacles, or detection of a target. Additionally, and in accordance with the invention, the sensors may be used to gather data which is used to generate an optical flow field that is used to efficiently encode video data captured by the UAV.

Accordingly, the UAV may also have an optical flow field generator 130. The optical flow field generator may be provided on-board the UAV (e.g., in the UAV body or arm, on the camera, or on the carrier). Alternatively, the optical flow field generator may be provided off-board the UAV (e.g., at a remote server, cloud computing infrastructure, remote terminal, or ground station). The optical flow field generator may have one or more processors that are individually or collectively configured to generate an optical flow field based on sensor data that is associated with the UAV. An optical flow field demonstrates how light flows within video frames. This flow of light indicates how captured objects are moving between video frames. In particular, the optical flow field is able to describe characteristics of how objects that are captured by a video capturing device are moving, including direction and speed of the moving objects. For instance, the video captured within the FOV of the video capturing device may include one or more stationary or movable objects. In examples, the optical flow field may be used to determine speeds or accelerations of objects that are moving in video. The optical flow field may also be used to demonstrate directions of movement of objects that are within the video. Examples of optical flow fields that describe objects moving within a video are described below with respect to FIGS. 5 to 11.
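One way such a generator might predict a flow field from gyroscope data alone is the standard pinhole rotational egomotion model sketched below. This is a hedged illustration rather than the disclosed implementation; the sign conventions depend on the chosen camera axes, and translation and lens distortion are ignored:

```python
import numpy as np

def rotational_flow_field(width, height, focal_px, wx, wy, wz, dt=1.0 / 30):
    """Predict per-pixel optical flow induced by pure camera rotation.

    For a pixel at (x, y) measured from the principal point, focal length
    f in pixels, and gyroscope rates (wx, wy, wz) in rad/s, the classic
    instantaneous image velocity is
        u = (x*y/f)*wx - (f + x**2/f)*wy + y*wz
        v = (f + y**2/f)*wx - (x*y/f)*wy - x*wz
    Returns an H x W x 2 array of (dx, dy) displacements over one frame
    interval dt.
    """
    xs = np.arange(width) - width / 2.0
    ys = np.arange(height) - height / 2.0
    x, y = np.meshgrid(xs, ys)
    f = float(focal_px)
    u = (x * y / f) * wx - (f + x ** 2 / f) * wy + y * wz
    v = (f + y ** 2 / f) * wx - (x * y / f) * wy - x * wz
    return np.stack([u, v], axis=2) * dt
```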

The sensor data that is used to generate the optical flow field may be obtained by the one or more sensors associated with the UAV. Additionally or alternatively, the sensor data may be obtained by an external source, such as an external monitoring system. The external sensor data may be provided to the UAV using a communication channel. Accordingly, the optical flow field may be generated at the UAV. Alternatively, an optical flow field may be generated external to the UAV. In particular, the UAV may provide sensor information that is associated with the UAV to one or more external processors. The one or more external processors may then use the sensor data that is associated with the UAV to generate an optical flow field. Further, the one or more external processors may provide the optical flow field that is generated to the UAV. The optical flow field generator, whether on-board or off-board the UAV, may receive data from sensors associated with the UAV (whether the sensors are on-board, off-board, or any combination thereof), which may be used to generate an optical flow field.

The sensor data may optionally include information about the spatial disposition of the camera (e.g., coordinates, translational position, height, orientation), or movement of the camera (e.g., linear speed, angular speed, linear acceleration, angular acceleration). The sensor data may be able to detect a zoom state of the camera (e.g., focal length, how far zoomed in or out). The sensor data may be useful for calculating how a FOV of the camera may change.

An encoding processor 150 may be provided in accordance with embodiments of the invention. The encoding processor may be used to encode video that is captured by the video capture device. Examples of entropy coding tools include Huffman coding, run-level coding, and arithmetic coding. In examples discussed herein, CAVLC and CABAC may be used in H.264.

Additionally, the encoding processor may use an optical flow field that is associated with the video. The optical flow field may be used to efficiently encode the video. The video may comprise video frame components. Video frame components may comprise a video frame. Alternatively, video frame components may comprise portions of a video frame, such as blocks. Blocks may have a shape such as a circle, square, octagon, triangle, or other shapes. Additionally, blocks within a video frame may include more than one shape.

The encoding processor may receive the optical flow field information and use the optical flow field information to encode the video. In examples, the encoding processor may use the optical flow field information to allocate bits for the encoding of video frame components. In particular, the encoding processor may allocate more bits to areas having more movement so as to capture distinctions between video frames in the encoding process. Additionally, the encoding processor may use the optical flow field information to select quantization steps for the encoding of video frame components. In particular, the encoding processor may select higher quantization steps for encoding video frame components that have a high degree of motion. Alternatively, the encoding processor may select lower quantization steps for encoding video frame components that are substantially similar. In examples, the encoding processor may select a low quantization step for encoding video frame components that are essentially identical.
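Extending the earlier frame-level sketch to the block level, a per-block quantization step map might be derived from the flow field as follows; the names and threshold are again illustrative assumptions:

```python
import numpy as np

def block_qstep_map(flow_field, block_size=16, qstep_low=6, qstep_high=20,
                    motion_threshold=4.0):
    """Assign a quantization step to each block from local flow magnitude.

    Blocks whose mean motion magnitude exceeds the threshold receive the
    coarser step (qstep_high); nearly static or substantially similar
    regions keep the finer step (qstep_low).
    """
    magnitudes = np.linalg.norm(flow_field, axis=2)
    rows = magnitudes.shape[0] // block_size
    cols = magnitudes.shape[1] // block_size
    qsteps = np.full((rows, cols), qstep_low)
    for r in range(rows):
        for c in range(cols):
            patch = magnitudes[r * block_size:(r + 1) * block_size,
                               c * block_size:(c + 1) * block_size]
            if patch.mean() > motion_threshold:
                qsteps[r, c] = qstep_high
    return qsteps
```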

The encoding processor may include one or more processors that may encode the video. The encoding processor may be separate from the optical flow field generator, or may be within the same component as the optical flow field generator. The encoding processor may include one or more processors that do not overlap with one or more processors of the optical flow field generator. Alternatively, one or more processors of the encoding processor may be the same as one or more processors of the optical flow field generator. In some instances, all processors of the encoding processor may be the same as the processors of the optical flow field generator.

The encoding processor may optionally be provided on-board the UAV. For instance, the encoding processor may be within the UAV body or arm, may be on-board the camera, or may be on-board a carrier supporting the camera. Alternatively, the encoding processor may be provided off-board the UAV. For instance, the encoding processor may be provided at a remote server, cloud computing infrastructure, remote terminal, or ground station. The encoding processor may be provided at a same or different location from the optical flow field generator.

FIG. 2 illustrates a general process 200 of video encoding, in accordance with embodiments of the invention. When encoding video data, video frames of the video data may be initially split into blocks 202. These blocks may then be compressed based on intra frame data and/or inter frame data. Intra frame data is directed towards the spatial relationship between blocks within a frame. Conversely, inter frame data is directed towards the temporal relationship between blocks across video frames. Additionally, the bit consumption of an intra coded frame is more than five times the bit cost of an inter coded frame across temporally related frames when the reconstructed pictures are of the same quality. Additionally, when there is a high degree of motion within video frames, such as video frames that have some objects moving quickly across a series of video frames and other objects moving in and out of the video frames, the bit cost of the inter coding of temporally related frames may significantly increase.

As seen in FIG. 2, an input video signal is received. The input video signal may be received from a video capture device. The video capture device may be supported by a support structure, such as a UAV. Additionally or alternatively, the input video signal may be received from an external device off-board the UAV. The received video may be split into macroblocks 202. Macroblocks may or may not have any overlapping portions. The video may be split into any number of macroblocks. For instance, the video may be split into an array of m×n macroblocks, where m has a value of 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 12 or more, 15 or more, 16 or more, 18 or more, 20 or more, 25 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, 100 or more, 120 or more, 150 or more, 200 or more, 250 or more, or 300 or more, and n has a value of 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 12 or more, 15 or more, 16 or more, 18 or more, 20 or more, 25 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, 100 or more, 120 or more, 150 or more, 200 or more, 250 or more, or 300 or more. The macroblock may have a rectangular shape, square shape, circular shape, or any other shape. In one embodiment, a macroblock may have a dimension of 16×16 pixels. The macroblock may have any dimension, such as p×q pixels, where p has a value of 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 12 or more, 15 or more, 16 or more, 18 or more, 20 or more, 25 or more, 30 or more, 32 or more, 40 or more, 50 or more, 60 or more, 64 or more, 70 or more, 80 or more, 90 or more, 100 or more, 120 or more, 128 or more, 150 or more, 200 or more, 250 or more, 256 or more, or 300 or more, and q has a value of 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 12 or more, 15 or more, 16 or more, 18 or more, 20 or more, 25 or more, 30 or more, 32 or more, 40 or more, 50 or more, 60 or more, 64 or more, 70 or more, 80 or more, 90 or more, 100 or more, 120 or more, 128 or more, 150 or more, 200 or more, 250 or more, 256 or more, or 300 or more. In modern video coding standards, a video frame having a resolution of 720P or 1080P may be encoded by first dividing the video frame into small blocks. For H.264, the block size may be 16×16 pixels, and for HEVC, the block size may be 64×64 pixels. Each macroblock may have the same dimension and/or shape. In examples, a macroblock may be a square, rectangle, circle, triangle, trapezoid, rhombus, oval, or another shape. Alternatively, two or more macroblocks may have differing dimensions and/or shapes. A macroblock may also be referred to as a ‘block.’
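A minimal sketch of this tiling step, assuming a grayscale frame held as a 2-D array (edge padding, which real encoders perform, is omitted for brevity):

```python
import numpy as np

def split_into_macroblocks(frame, block_size=16):
    """Tile a frame into non-overlapping block_size x block_size blocks.

    frame: H x W array. Returns a list of ((row, col), block) entries;
    partial blocks at the right/bottom edges are simply skipped here.
    """
    h, w = frame.shape[:2]
    blocks = []
    for r in range(0, h - block_size + 1, block_size):
        for c in range(0, w - block_size + 1, block_size):
            blocks.append(((r, c), frame[r:r + block_size, c:c + block_size]))
    return blocks

# A 1080P frame tiles into 67 * 120 = 8040 full 16x16 macroblocks
# (the H.264 case; HEVC would instead use 64x64 tree blocks).
frame = np.zeros((1080, 1920), dtype=np.uint8)
print(len(split_into_macroblocks(frame)))  # 8040
```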

An encoding processor may be used to remove the correlation of the blocks spatially and/or temporally. As such, after a video frame is divided into small blocks, the blocks of video data may go through a video encoding architecture as provided in FIG. 2.

In particular, the video data may proceed to a coder control 204. The coder control may be used to determine whether to encode the video data directly, e.g., without any additional transformation steps, or whether to send the data to a transformation/scaling/quantization (TSQ) component. In examples, the coder control may pass the video data directly to an entropy coding component 206. In other examples, the coder control may pass the video data to a TSQ component 208 prior to providing the transformed data to the entropy coding component. At the TSQ component, the video data may be transformed so as to compress similarities between spatially and temporally related video frame components, such as blocks. This process may use video from the original input video signal. Additionally, this process may utilize previously encoded video data so as to make the transformation process more efficient. Additionally, this compression process may result in quantization and transformation coefficients 210, which may then be provided to the entropy encoding component. Coefficients may be calculated based on discrete cosine transforms (DCT) and may be used to represent differences between video frame components such as video frames or blocks within a video frame.
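As an illustration of the DCT-plus-quantization idea, not the specific TSQ implementation of FIG. 2, the classic orthonormal 8×8 DCT-II followed by uniform quantization can be sketched as follows; note how a larger quantization step zeroes more coefficients, trading quality for bits:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix for n-point transforms."""
    k = np.arange(n)
    mat = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    mat *= np.sqrt(2.0 / n)
    mat[0, :] = np.sqrt(1.0 / n)
    return mat

def transform_and_quantize(block, qstep):
    """Forward 2-D DCT of one residual block, then uniform quantization."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T  # transform rows, then columns
    return np.round(coeffs / qstep).astype(int)

residual = np.random.randint(-32, 32, size=(8, 8)).astype(float)
print(transform_and_quantize(residual, qstep=10))
```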

When transforming the video data, the video data may be processed in view of previously transformed video data that is re-evaluated at decoder 212 and that is provided as feedback to the TSQ component. In particular, video compression feedback may be generated by providing transformed video data from the TSQ component to a scaling and inversion transformation (SIT) component 214. At the SIT component, the transformation process of the video data may be reversed. This video data may then be provided to a de-blocking filter 216, which may be used to generate an output video signal 218. The output video signal may then be used as a component to generate motion compensation factors at motion compensation component 220.

In examples, the motion compensation component may use motion data from an output video signal as well as motion data that is generated from motion estimation component 222. In particular, the motion estimation component may receive input video data from the initial input video signal. The motion estimation component may then generate motion data based on the video data. This motion data may then be provided to the motion compensation component and the entropy coding component.

Once the decoded video data is provided and contextualized based on motion data from the motion compensation component, the video data may be evaluated for intra-frame prediction using intra-frame prediction component 224. Additional predictions may also be generated for inter-frame predictions. These predictions may be provided as feedback for both the TSQ component as well as the de-blocking filter. As such, the quantization and transformation coefficients that are generated by the TSQ component, as well as the output signal that is generated by the de-blocking filter, may be refined based on feedback from processed video data.

As such, a video encoder may be used to simplify duplicate information, both between blocks of different video frames (temporal compression) and between blocks within the same video frame (spatial compression), so as to condense information. Once the video data is condensed, the video frames that are encoded utilizing the architecture in FIG. 2 may be formed into a 1-D bitstream.

FIG. 3 illustrates a process 300 of determining video data compression of a video frame component based on movement within the video, in accordance with embodiments of the invention. At step 310, an encoding cost for encoding a video frame component is determined based on an algorithm of a rate distortion optimization. Rate distortion optimization is an optimization of encoding parameters so as to provide a particular bitrate and distortion of reconstructed video frames. The rate distortion optimization can be determined using motion information, block size information, and block coefficient information. The encoding cost may be a range of bits that may be allocated to encode a particular video frame component. In particular, the encoding cost is determined by assessing parameters of the rate distortion optimization and determining the bits that may be used for encoding so as to ensure that the bitrate of reconstructed frames is within a CBR. In embodiments discussed herein, an encoding cost may be provided, and methods provided herein may be used to allocate bits and/or select quantization steps so as to efficiently encode video data within the parameters of the provided encoding cost.
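
Rate distortion optimization is commonly expressed as minimizing a Lagrangian cost J = D + λR over candidate encoding parameters. The following is a minimal sketch of that selection rule with hypothetical distortion and rate numbers; it illustrates the general technique, not the specific cost function of any embodiment.

    def rd_cost(distortion, rate_bits, lam):
        # Rate-distortion Lagrangian: J = D + lambda * R.
        return distortion + lam * rate_bits

    # Choosing between two hypothetical parameter sets for one block:
    candidates = [
        {"name": "small quantization step", "distortion": 40.0, "rate_bits": 900},
        {"name": "large quantization step", "distortion": 95.0, "rate_bits": 300},
    ]
    lam = 0.1
    best = min(candidates,
               key=lambda c: rd_cost(c["distortion"], c["rate_bits"], lam))
    print(best["name"])   # "large quantization step" wins at this lambda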

At step 320, motion information associated with the video frame component is received. In examples, the motion information may be based on an optical flow field that is associated with the video frame component. The motion information may include motion data that is associated with the video frame component. Additionally, the motion information may include motion data that is associated with video frame components that are adjacent to the video frame component. Additionally, optical flow fields may include motion data that is generated by movement of a video capture device and/or movement of a UAV. Motion data may include translational and/or rotational movement. In examples, motion data may be generated by rotating a video capture device about a roll axis. Motion data may also be generated by rotating a UAV about a camera roll axis. In examples, motion data may be generated by moving a video capture device and/or UAV about other axes, such as pitch and yaw. Further, motion data may be generated by moving the video capture device and/or UAV in a sideways, upwards, downwards, zoom-in, zoom-out, or diagonal motion, or a combination thereof. In additional examples, generated optical flow fields may include motion aspects related to the speed of moving objects, distance of moving objects from a video capture device, curving motion of moving objects, directionality of moving objects, and other characteristics of object movement within an optical flow field.

At step 330, at least one portion of the video frame component is assessed against a threshold amount of motion. In examples, a portion of a video frame component that is determined to have more than a threshold amount of motion may be assessed as having a high degree of motion. Additionally, a portion of a video frame component that is determined to have less than a threshold amount of motion may be assessed as having a low degree of motion. Further, a portion of the video frame component that has neither a high degree nor a low degree of motion may be determined to have a normal degree of motion.
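
A minimal sketch of the threshold test at step 330 follows; the numeric thresholds and the use of a mean optical-flow magnitude (in pixels per frame) are illustrative assumptions, since the document does not fix particular values.

    def classify_motion(mean_flow_magnitude, low_threshold=0.5, high_threshold=2.0):
        # Bucket a portion's motion into the low / normal / high degrees
        # described above by comparing against two thresholds.
        if mean_flow_magnitude > high_threshold:
            return "high"
        if mean_flow_magnitude < low_threshold:
            return "low"
        return "normal"

    print(classify_motion(3.1))   # high
    print(classify_motion(1.0))   # normal
    print(classify_motion(0.2))   # low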

At step 340, bits are allocated to at least one portion of the video frame component based on the motion data. In some instances, this may include allocating the bits based on threshold motion assessments. In particular, a standard bit amount may be allocated to at least one portion of the video frame component that is determined to have a normal degree of motion. Additionally, an augmented bit amount may be allocated to at least one portion of the video frame component that is determined to have a high degree of motion. Further, a lesser bit amount may be allocated to at least one portion of the video frame component that is determined to have a low degree of motion. For instance, a portion of the video frame component having a higher degree of motion may receive a higher bit allocation than a portion of the video frame component having a lower degree of motion. By allocating more bits for encoding a portion of the video frame component having a higher degree of motion, the differences between video frames may be more accurately reflected. In particular, video having a high degree of motion may have more objects moving in and out of the video frames than video having a lower degree of motion. As such, more bits may be allocated to encode these differences.
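
One plausible reading of step 340 is a weighted split of a frame budget, with higher-motion portions weighted more heavily. The weights below are placeholders; the document requires only that the high-motion category receive an augmented amount and the low-motion category a lesser amount.

    def allocate_bits(block_categories, frame_budget_bits):
        # Standard / augmented / lesser amounts, expressed as weights.
        weights = {"low": 0.5, "normal": 1.0, "high": 2.0}
        total = sum(weights[c] for c in block_categories)
        return [frame_budget_bits * weights[c] / total for c in block_categories]

    # Three portions (still, normal, fast) sharing a 6000-bit budget:
    print(allocate_bits(["low", "normal", "high"], 6000))
    # [857.14..., 1714.28..., 3428.57...]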

While an augmented bit amount, when available, may be allocated to the at least one portion of the video frame component that is determined to have a high degree of motion, there are examples where the available bit amount may be limited. In these examples, an encoding processor may choose to use a quantization step to compress video data. Quantization is a lossy compression technique that is achieved by compressing two or more values to a single quantum value. In image processing, quantization may be especially useful in compressing differences between frequencies of brightness variations that are not easily distinguishable by the human eye. For example, the human eye may be good at perceiving differences of brightness across large frequencies, but may not be able to distinguish varying frequencies that are cumulatively less than a perceptible threshold of difference. Accordingly, video data may be compressed by taking frequencies within the video data that are associated with brightness, dividing the frequencies by a standard value, and then rounding the resulting frequency values up (or down) to the nearest integer. So long as the variation of frequencies is still beneath the threshold of human perception of differences between frequencies, a user watching the reconstructed video may not even be aware of the distinctions between the original video data and the modified video data. However, the ability to reference a smaller range of frequencies than the range originally captured may allow the video data to be compressed to an amount of bits that is consistent with the encoding cost associated with a CBR for providing reconstructed video.
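
The divide-and-round operation described above can be sketched in a few lines; the coefficient values and step size are arbitrary, and real encoders quantize transform coefficients with more elaborate rounding offsets.

    import numpy as np

    def quantize(coefficients, step):
        # Divide by the quantization step and round to the nearest
        # integer -- the lossy step described above.
        return np.rint(np.asarray(coefficients, dtype=float) / step).astype(int)

    def dequantize(levels, step):
        # Approximate reconstruction; the rounding error is the loss.
        return np.asarray(levels) * step

    coefficients = [312.0, -47.0, 9.0, 3.0]
    levels = quantize(coefficients, step=10)    # [31, -5, 1, 0]
    print(dequantize(levels, step=10))          # [310 -50  10   0]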

In addition to choosing to perform a quantization step on data within a video frame component, an encoding processor may also choose a degree of quantization that is used. In particular, the degree of quantization refers to the magnitude of the standard value that is used to divide a set of data, such as the brightness frequencies discussed above. As the standard value that is used to divide data increases, the amount of compression may also be increased. As such, the standard value and the degree of compression may be directly proportional. In examples, the standard value and the degree of compression may be directly linearly proportional.

At step 350, a determination is made as to whether a quantization step is needed to compress the video frame component. This determination may be made based on the provided encoding cost as well as the degree of motion within the video frame component. In particular, if there is a high degree of motion associated with at least one portion of the video frame component, but there are not bits available to allocate to the at least one portion of the video frame component having a high degree of motion, a determination may be made to select a quantization step for that at least one portion of the video frame component. Additionally, the degree of quantization that may be used may be calculated during the determination step 350. In particular, the degree of quantization may be calculated based on the encoding cost of the video frame component and the amount of data that needs to be reduced so as to ensure the reconstructed frames will be within a CBR.
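
One way the decision at step 350 could be expressed is sketched below: if the bits required by the high-motion content exceed the available budget, a quantization step is selected and scaled to the deficit. The proportional rule is an assumption for the sketch; the document states only that the degree of quantization is calculated from the encoding cost and the amount of data to be reduced.

    def choose_encoding_strategy(required_bits, available_bits, base_step=10):
        # If the budget covers the motion content, no extra quantization
        # is needed; otherwise scale the step up with the bit deficit.
        if required_bits <= available_bits:
            return {"quantize": False, "step": base_step}
        deficit_ratio = required_bits / available_bits
        return {"quantize": True, "step": base_step * deficit_ratio}

    print(choose_encoding_strategy(9000, 6000))
    # {'quantize': True, 'step': 15.0}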

Additionally, at step 360, a quantization step is determined for at least one portion of the video frame component. In particular, the selected quantization step may be based on the size of the at least one portion of the video frame component. The selected quantization step may also be based on the motion information within the at least one portion of the video frame component. Further, the selected quantization step may be based on the block coefficient information associated with the at least one portion of the video frame component.

Accordingly, video frame components may be encoded so as to stay within the threshold of encoding cost associated with a CBR of reconstructed video. In particular, the video frame components may be encoded by an encoding processor to stay within the encoding costs by using bit allocation and/or quantization step selection. As video frame components may have varying degrees of motion, however, the degree to which an encoding processor uses bit allocation versus quantization step selection may also vary based upon motion within the video frame components. In particular, when encoding video frame components, a particular bit allocation and/or quantization step may be selected to encode the video frame components based on motion within the video frame components. In examples, the particular bit allocation and/or quantization step that is selected may be based on a threshold of encoding cost associated with encoding the video frame components so as to maintain a CBR when the encoded video is decoded.

In order to illustrate this variance across video frame components, FIG. 4 illustrates schematics 400 of bitrate and quantization step distributions between video frame components having different motion components, in accordance with embodiments of the invention. In particular, distributions 410-430 illustrate bit allocation and/or quantization step selection on a frame-by-frame basis, and distributions 440-470 illustrate bit allocation and/or quantization step selection on a block-by-block basis.

As seen in FIG. 4, distribution 410 illustrates an increase in bit allocation. Bit allocation may be increased so as to increase the amount of bits that are allocated for encoding a portion of a video frame component. Bits may be increased on a sliding scale based on the amount of movement within a video frame component. Bits may be increased based on categories associated with amounts of bits that are allocated to encoding video frame components. In particular, distribution 410 illustrates an increase in bit allocation across a video frame. Bit allocation may be increased when a video frame includes more than a threshold amount of motion. When more than a threshold amount of motion is present, more bits may be allocated to encode areas having a greater amount of movement so that the movement may be accurately encoded. In particular, an encoding processor may increase bit allocation when a video frame includes portions that have more than a threshold amount of motion. An example of an optical flow field that may be associated with a distribution similar to distribution 410 is provided in FIG. 5.

Additionally, distribution 420 illustrates a decrease in bit allocation. Bit allocation may be decreased so as to decrease the amount of bits that are allocated for encoding a portion of a video frame component. Bits may be decreased on a sliding scale based on the amount of movement within a video frame component. Bits may be decreased based on categories associated with amounts of bits that are allocated to encoding video frame components. In particular, distribution 420 illustrates a decrease in bit allocation across a video frame. Bit allocation may be decreased when a video frame has less than a threshold amount of motion. In particular, in examples where video frames are substantially similar, fewer bits may be needed to accurately represent the differences between the similar frames. An example of an optical flow field that may be associated with a distribution similar to distribution 420 is provided in FIG. 6. Further, distribution 430 illustrates an increase in quantization step. In particular, quantization steps may have different categories of, for example, low, medium, or high quantization. The degree of quantization may be objectively or relatively assessed in view of the different quantization categories. A quantization step may be increased when there is more than a threshold amount of movement in a video frame, and when there are not sufficient bits to allocate to the encoding of the movement in the video frame. As such, an encoding processor may determine areas of a video frame having more than a threshold amount of motion and may assess whether there are sufficient bits to allocate to these areas.

If there are not sufficient bits, the encoding processor may increase a quantization step so as to encode the video while maintaining a CBR when the video data is decoded. In particular, distribution 430 illustrates an increase in quantization step across a video frame. A quantization step may be increased so as to increase the degree of compression of video frame components, thereby decreasing the amount of bits that are used for encoding video frame components. Quantization steps may be increased on a sliding scale based on the amount of movement within a video frame component. Quantization steps may be increased based on categories associated with an amount of movement within video frame components being encoded. An example of an optical flow field that may be associated with a distribution similar to distribution 430 is provided in FIG. 7.

While quantization steps may be increased as demonstrated in distribution 430, quantization steps may also be decreased. A quantization step may be decreased so as to decrease the degree of compression of video frame components. It may be beneficial to decrease a quantization step when there are sufficient bits to allocate towards encoding the video frame component. In particular, quantization may be lossy, thereby potentially creating errors when encoding video frame components. Quantization steps may be decreased on a sliding scale based on the amount of movement within a video frame component. Quantization steps may be decreased based on categories associated with an amount of movement within video frame components being encoded. Additionally, quantization steps may be decreased when motion within a video frame falls below a threshold associated with a particular quantization step and when there are sufficient bits to allocate to encoding a video frame component within the video frame.

Additionally, as seen in FIG. 4, distribution 440 illustrates a standardized bit allocation. In particular, distribution 440 illustrates a standardized bit allocation across blocks within a video frame. This is illustrated in distribution 440 by Block 1 and Block 2 being the same size, indicating that they are allocated the same amount of bits. An example of an optical flow field that may be associated with a distribution similar to distribution 440 is provided in FIG. 8. Further, distribution 450 illustrates an uneven bit allocation. In particular, distribution 450 illustrates an uneven bit allocation across blocks within a video frame. This is illustrated in distribution 450 by Block 1 being larger than Block 2, indicating that more bits are allocated to Block 1 than to Block 2. An example of an optical flow field that may be associated with a distribution similar to distribution 450 is provided in FIG. 9.

Also as seen in FIG. 4, distribution 460 illustrates an uneven, mutually augmented bit allocation. In particular, distribution 460 illustrates an uneven, mutually augmented bit allocation across blocks within a video frame. This is illustrated in distribution 460 by Block 1 and Block 2 both being allocated more bits than the standardized allocations that are provided in Blocks 1 and 2 of distribution 440. In contrast to a standardized allocation of bits, distribution 460 provides that Block 2 is allocated an augmented amount of bits, and Block 1 is allocated more bits than Block 2. An example of an optical flow field that may be associated with a distribution similar to distribution 460 is provided in FIG. 10. Additionally, distribution 470 illustrates a multiple category bit allocation. In particular, distribution 470 illustrates a multiple category bit allocation across blocks within a video frame. This is illustrated in distribution 470 by Block 1 being allocated an augmented amount of bits, Block 2 being allocated a standardized amount of bits, and Block 3 being allocated a decreased amount of bits. An example of an optical flow field that may be associated with a distribution similar to distribution 470 is provided in FIG. 11.

Examples of video frames that may have differing degrees of bit allocation versus quantization step selection, given constant encoding cost per video frame, are provided in FIGS. 5-7. In examples, an optical flow field may be provided to contextualize video data that is being encoded by an encoding processor. An optical flow field may be generated based on image data. Additionally or alternatively, an optical flow field may be generated based on sensor data. In examples, the optical flow field may be generated as discussed in FIG. 1. In some examples, the optical flow field can help to contextualize the video data so as to help the encoding processor encode video data on a frame-by-frame basis. In particular, on a frame-by-frame basis, the encoding processor may allocate more bits to a frame when an optical flow field associated with that frame indicates that the objects in the video frame are moving very fast. In examples where there are not bits available to allocate to video frames having a high amount of motion, the encoding processor may instead choose a quantization step (or a higher quantization step) so as to counteract the increase of the bitrate that would otherwise be caused by the high degree of motion within the video frame. Additionally, the encoding processor may decrease the number of bits allocated to a video frame if a large portion of the video frame is relatively still. Instead, the encoding processor may provide the bit allocation to another video frame that may have a high degree of motion.

FIG. 5 illustrates an optical flow field 500 that is associated with a rotating view from above for encoding a video frame, in accordance with embodiments of the invention. While optical flow field 500 results from a view from above, other motions, such as rolling about an optical axis of a camera, may also be used to generate a rotating view. Motion within the optical flow field is indicated using arrows. The length of the arrows indicates the amount of movement that is occurring across the optical flow field, and the curve of the arrows indicates the direction of movement that is occurring across the optical flow field. In examples, the video frame of FIG. 5 may have a relatively normal amount of motion throughout the video frame. While central portions of the video frame in FIG. 5 may be allocated slightly augmented bit allocations, given that motion in that area is dense, peripheral portions of the video frame illustrated in FIG. 5 may each be allocated a standard bit amount, given that motion in the peripheral portions is less dense than that in the central region. As such, FIG. 5 may merely have an augmented bit allocation, similar to distribution 410 in FIG. 4. Additionally, as discussed above, the augmented bit allocation as provided in FIG. 5 may be within a threshold of encoding cost associated with a CBR of reconstructed video.

Additionally, FIG. 6 illustrates a global optical flow field 600 having different degrees of object movement for encoding a video frame, in accordance with embodiments of the invention. As seen in FIG. 6, some objects near the top of the optical flow field are relatively still. In particular, objects that seem to be relatively still may be far away from an image capture device, as objects that are moving at the same speed will have differing perceived speeds based on the distance of the object from a video capture device. Alternatively, objects that are moving at a constant speed may appear to be relatively still if a video capture device is moving at the same speed and in the same direction as the objects. In examples, the video capture device may be moving at a particular speed based upon movement of a UAV to which the video capture device is attached. Alternatively, the video capture device may be moving at a particular speed based on the movement of the video capture device itself relative to a UAV to which it is attached.

When a significant amount of area within an optical flow field associated with a video frame appears to be relatively still, an encoding processor may choose to reduce the amount of bits that are allocated to the video frame. In particular, the encoding processor may shift some bits that may otherwise be allocated to video frames having still areas and may allocate those bits to video frames having areas with greater amounts of motion.

In contrast to the upper portion of the optical flow field in FIG. 6, some objects that are in the central and lower parts of the optical flow field are moving relatively fast. In particular, objects may seem to move relatively fast based on their movement relative to a video capture device. For instance, if a video capture device is moving quickly past a stationary object, the stationary object may seem to be moving quickly based on the movement of the video capture device. In examples, the perceived movement of objects may have a motion component that is associated with movement of the video capture device and/or a motion component that is associated with movement of a movable object, such as a UAV, to which the video capture device is attached.

However, given the large amount of area within the video frame that is relatively still, the overall allocation of bits to the video frame of FIG. 6 may still be reduced. As such, the video frame provided in FIG. 6 may have a reduced bit allocation, similar to distribution 420 in FIG. 4.

In another example, FIG. 7 illustrates an optical flow field 700 that is associated with ultra-fast global camera motion for encoding a video frame, in accordance with embodiments of the invention. In particular, the optical flow field 700 that is provided in FIG. 7 has a uniformly downward direction. Additionally, the downward motion is illustrated as being fast due to a high density of arrows. In examples, the downward direction of the optical flow field may appear to be fast in the video data based on one or more objects that are moving quickly past a video capture device. In other examples, the downward direction of the optical flow field may appear to be fast in the video data based on the movement of a video capture device relative to objects within the captured video data. In further examples, the downward direction of motion arrows within the optical flow field may appear to be fast in the video data based on a combination of the objects that are moving quickly past the video capture device and the fast movement of the video capture device itself.

As the directionality in the optical flow field is uniformly downward, the same amount of bits may be allocated across the video frame. However, given the great amount of movement, there may be insufficient bits available to capture the high degree of motion. Accordingly, when a significant amount of area within an optical flow field associated with a video frame appears to move relatively fast, an encoding processor may choose to select a quantization step (or to select an increased quantization step) to use when encoding video data associated with the video frame. As such, the video frame provided in FIG. 7 may have an increased quantization step selected by the encoding processor, similar to distribution 430 in FIG. 4.

Additional examples of video frames that may have differing degrees of bit allocation versus quantization step selection, given constant encoding cost per video frame, are provided in FIGS. 8-11. In examples, the optical flow field can help to contextualize the video data so as to help the encoding processor encode video data within a video frame on a block-by-block basis. In particular, among different blocks within a video frame, the optical flow field can indicate whether some portions of a video frame are moving faster than other portions. These portions of the video frame may be represented by blocks within the video frame. As such, on a block-by-block basis, the encoding processor may allocate the bitrate within a frame globally and differentially across blocks within the video frame. In particular, the encoding processor may allocate more bits to blocks when an optical flow field indicates that the objects moving through the blocks are moving very fast. In examples where there are not enough bits available to allocate to blocks that are associated with a high amount of motion, the encoding processor may instead choose a quantization step (or a higher quantization step) so as to counteract the increase of the bitrate that would otherwise be caused by the high degree of motion within the blocks. Additionally, the encoding processor may decrease the number of bits allocated to blocks that are relatively still. Instead, the encoding processor may provide the bit allocation to other blocks that may have a high degree of motion.

In examples, FIG. 8 illustrates two video frame components, which are to be encoded, within an optical flow field 800 that is associated with angled global motion, in accordance with embodiments of the invention. In particular, the optical flow field that is provided in FIG. 8 has a uniformly angled direction towards the bottom right corner of the optical flow field. In examples, the direction of motion arrows within the optical flow field may appear to be angled in the video data based on one or more objects that are moving at an angle past a video capture device. In other examples, the direction of motion arrows within the optical flow field may appear to be angled in the video data based on an angled movement of a video capture device relative to objects within the captured video data. In further examples, the direction of motion arrows within the optical flow field may appear to be angled in the video data based on a combination of the objects that are moving at an angle past the video capture device and the movement of the video capture device itself.

FIG. 8 also provides two blocks, block 810 and block 820, that are video frame components of a video frame. In examples, an encoding processor that is encoding the video frame having blocks 810 and 820 may allocate bits evenly or unevenly across the video frame. In particular, the distribution of bits across the video frame may be based on motion data that is associated with the video frame. As seen in FIG. 8, the motion data provided by the optical flow field indicates that there is uniform motion across the video frame. As such, the encoding processor may allocate an equal amount of bits across blocks 810 and 820. In this way, FIG. 8 may have a standardized bit allocation, similar to distribution 440 in FIG. 4.

Additionally, FIG. 9 illustrates two video frame components, which are to be encoded, within an optical flow field that is associated with a zoom-in feature of a camera, in accordance with embodiments of the invention. In examples, the zoom-in feature may occur based on a video capture device zooming in on an object; based on the support structure of an aerial vehicle allowing a camera to move in closer; or a combination of the two. As seen in FIG. 9, movement at the edge of the optical flow field is larger than movement at the middle of the optical flow field. Additionally, the directionality of the zoom-in is equal across the optical flow field. In other words, there is no apparent bias in the vertical or horizontal direction, as each direction is moving in a similar fashion. However, while there is no directional bias, the motion within FIG. 9 is more concentrated near central areas and sparser near peripheral areas.

FIG. 9 also provides two blocks, block 910 and block 920, that are video frame components of a video frame. In examples, an encoding processor that is encoding the video frame having blocks 910 and 920 may allocate bits evenly or unevenly across the video frame. In particular, the distribution of bits across the video frame may be based on motion data that is associated with the video frame. As seen in FIG. 9, the motion data provided by the optical flow field indicates that there is a greater concentration of motion in the central portion of the video frame than in the peripheral portion of the video frame. Additionally, block 910 is relatively centrally located, while block 920 is located closer to the peripheral portion of the video frame. Accordingly, an encoding processor may allocate more bits to block 910, as block 910 has a centralized location in the video frame that has a high degree of motion. Conversely, block 920 may be allocated a standard amount of bits and/or a lesser amount of bits than block 910. As such, the encoding processor may allocate an unequal amount of bits across blocks 910 and 920. Accordingly, FIG. 9 may have a disproportionate bit allocation, similar to distribution 450 in FIG. 4.

The perceived size of objects within an optical flow field may vary based on the location of the objects within the optical flow field. For example, when an optical flow field is generated based on a zoom-in action, objects that are the same size in real life may appear to be larger as they are located further towards the edge of the optical flow field. This is illustrated in FIG. 9, which illustrates a first ball 930 that is near a normalized minimum at the center of the optical flow field and a second ball 940 that is near a periphery of the optical flow field. Although first ball 930 and second ball 940 are of equal size, they appear to be of different sizes when viewed in the context of the optical flow field. Accordingly, the perceived size of objects may vary across optical flow fields. In particular, the perceived size of objects may vary in a manner that is directly proportional, inversely proportional, or modeled by another equation as objects are placed at different locations across the optical flow field.

In additional examples, FIG. 10 illustrates two video frame components, which are to be encoded, within an optical flow field 1000 that is associated with a rotating view from above, in accordance with embodiments of the invention. As in FIG. 5, motion within the optical flow field is indicated using arrows. The length of the arrows indicates the amount of movement that is occurring across the optical flow field, and the curve of the arrows indicates the direction of movement that is occurring across the optical flow field. FIG. 10 also provides two blocks, block 1010 and block 1020, that are video frame components of a video frame. When an encoding processor encodes the video frame, a distribution of bits across the video frame may be based on motion data that is associated with the video frame. As seen in FIG. 10, the motion data provided by the optical flow field indicates that the relative motion associated with rotation within the video frame is generally constant. However, similar to FIG. 9, the optical flow field within FIG. 10 also indicates that there is a greater concentration of motion in the central portion of the video frame than in the peripheral portion of the video frame. Additionally, block 1010 is relatively centrally located, while block 1020 is located closer to the peripheral portion of the video frame. Accordingly, an encoding processor may allocate more bits to block 1010, as block 1010 has a centralized location in the video frame that has a greater amount of motion. The encoding processor may also allocate additional bits to block 1020, but the augmented bits for block 1020 may be less than the amount of bits allocated to block 1010. As such, the encoding processor may allocate an unequal amount of bits across blocks 1010 and 1020. In this way, FIG. 10 may have an uneven, but mutually augmented, bit allocation, similar to distribution 460 in FIG. 4.

Further, FIG. 11 illustrates three video frame components, which are to be encoded, within a global optical flow field 1100 having different degrees of object movement, in accordance with embodiments of the invention. In particular, FIG. 11 provides an example of an optical flow field that has different rates of movement associated with objects within a video frame. As seen in FIG. 11, some objects near the top of the optical flow field are relatively still. In contrast, some objects that are in the central and lower part of the optical flow field are moving relatively fast. In particular, objects may seem to move relatively fast based on their movement relative to a video capture device.

An encoding processor that encodes the video frame provided in FIG. 11 may provide at least three categories of bit distribution across the video frame based on motion data that is associated with the video frame. In examples, the encoding processor may provide 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more, 21 or more, 22 or more, 23 or more, 24 or more, 25 or more, 30 or more, 35 or more, 40 or more, 45 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, 100 or more, or more than 100 categories of bit distribution. Not all categories that are available for bit distribution by the encoder may be present in any given video frame. In examples, however, at least one bit distribution category may be provided in each video frame that is encoded by the encoding processor. As seen in FIG. 11, the motion data provided by the optical flow field indicates that the relative motion within the video frame falls into at least three general categories: fast, standard, and relatively still. In particular, block 1110 is fast, block 1120 is standard, and block 1130 is relatively still. Accordingly, an encoding processor may allocate more bits to block 1110 than to block 1120, as block 1110 has a greater amount of motion than block 1120. Additionally, the encoding processor may allocate more bits to block 1120 than to block 1130, as block 1120 has a greater amount of motion than block 1130. In examples where the stillness of block 1130 falls below a movement threshold, the encoding processor may reduce the amount of bits allocated to block 1130. As such, the encoding processor may allocate an unequal amount of bits across blocks 1110, 1120, and 1130. In this way, FIG. 11 may have bit allocation associated with multiple categories, similar to distribution 470 in FIG. 4.

Further examples of video frames that may have differing degrees of bit allocation versus quantization step selection, given constant encoding cost per video frame, are provided in FIGS. 12-14. In examples, the optical flow field can help to contextualize the video data so as to help the encoding processor encode video data within a video frame, within and between video frame components. In particular, within and between block components, an optical flow field may be used to tune the bits that are allocated to 1) identifying a motion vector, and 2) calculating a coefficient. In examples, when motion within a video frame is severe, thereby increasing the total amount of motion information that is associated with the video frame, an example of a tuning strategy may allocate more bits to calculating a coefficient rather than to identifying a motion vector. In particular, if more bits are allocated towards calculating a coefficient, the integrity of a consistent motion vector field may be maintained. Under this strategy, the maintenance of the motion vector field is prioritized over the search for a motion vector, as it is generally very costly in terms of bit allocation to search for a motion vector when motion between video frame components is above a certain threshold of motion. Additionally, when the motion data associated with a video frame exceeds a certain threshold of activity, it is easier for the encoding processor to make a mistake in the identification of the motion vector. Further, the misidentification of a motion vector may propagate a series of errors that are not generally easy to trace back. Accordingly, under some strategies, bits are preferentially allocated to calculating an accurate coefficient rather than to identifying a motion vector.

In examples, the prioritization of calculating a coefficient over identifying a motion vector may be applied both in determining a current block's quantization step as well as in contributing to the RDO in a motion search. Accordingly, if motion within a video frame is severe (e.g., exceeds a certain threshold), the RDO cost function may be adjusted so that a more precise motion vector may be identified. In this way, bits that may be allocated to encode the residual data between video frame components may be saved. Additionally or alternatively, a smaller quantization step may be applied to produce visual quality of reconstructed frames that exceeds a threshold associated with the determined RDO.

Accordingly, the calculation of coefficients when encoding video data may be prioritized over the identification of motion vectors when motion within a video frame is severe. In particular, the calculation of coefficients may be based on residual data between video frames when encoding video data, such as when an encoding processor utilizes intra coding and/or inter coding. Accordingly, FIG. 12 illustrates examples of intra coding of pixels within a block in a video frame component, in accordance with embodiments of the invention.

Intra coding may be used to condense spatial correlations. For a block within a video frame, a predictor of pixel values within the block may be estimated from its neighboring pixels. For example, a predictor of pixel values may be estimated from neighboring pixels such as the upper, left, upper right, and lower left neighboring pixels. Examples of these predictions may be directional so as to correspond with the pattern within a pixel block. A demonstration of H.264 directional intra prediction is provided in FIG. 12.

As seen in FIG. 12, pixels that are adjacent to a block may be used to predict motion of pixels within the block. In particular, when intra coding is used, the pixels adjacent to a block are assessed for motion data. In FIG. 12, the pixels that are assessed are in a column to the left of the block and in a row above the block. The assessed motion of the blocks may be associated with a particular mode that is used by an encoding processor. As all of the adjacent pixels may not have the same motion information, a mode of assessed motion may be assigned to a block when the adjacent pixels have a threshold number of pixels associated with a particular mode. In examples, the adjacent pixels may be assigned to a particular mode when any of 100%, 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, over 50%, 50%, or a majority of adjacent pixels are associated with a particular mode.
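
For concreteness, a minimal sketch of three of the H.264 4×4 intra prediction modes referenced above (mode 0, vertical; mode 1, horizontal; mode 2, DC) is given below; the function name is illustrative, and the remaining directional modes are omitted.

    import numpy as np

    def intra_predict_4x4(top, left, mode):
        # `top` is the row of four reconstructed pixels above the block;
        # `left` is the column of four reconstructed pixels to its left.
        top, left = np.asarray(top), np.asarray(left)
        if mode == 0:    # vertical: propagate the top row downward
            return np.tile(top, (4, 1))
        if mode == 1:    # horizontal: propagate the left column rightward
            return np.tile(left.reshape(4, 1), (1, 4))
        if mode == 2:    # DC: flat block at the rounded mean of the neighbors
            return np.full((4, 4), (top.sum() + left.sum() + 4) // 8)
        raise ValueError("only modes 0-2 are sketched here")

    print(intra_predict_4x4([10, 20, 30, 40], [8, 8, 8, 8], mode=0))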

Additionally, the mode that is assigned to the adjacent pixels may be used to determine the predictive motion of the pixels in the block. For example, in mode 0, the pixels that are adjacent to a block may be assessed as having a downward motion. As seen in FIG. 12, this downward motion may be used to predict a downward motion through the predictive pixels. As provided in FIG. 12, the downward motion through the predictive pixels is entirely based on the assessed motion of the row of adjacent pixels above the block.

In mode 1, the pixels that are adjacent to a block may be assessed as having a sideways motion. As seen in FIG. 12, this sideways motion may be used to predict a motion to the right throughout the predictive pixels. As provided in FIG. 12, the sideways motion through the predictive pixels is entirely based on the assessed motion of the left column of adjacent pixels next to the block. In mode 2, the pixels that are adjacent to the block may be assessed as having a normal, or neutral, movement. Based on this assessment, the pixels within the block may be assessed to have a neutral movement as well.

In mode 3, the pixels that are adjacent to a block, and in close proximity to the upper portion of the block, may be assessed as having a leftward angled motion. As seen in FIG. 12, this leftward angled motion may be used to predict a motion to the downward left throughout the predictive pixels. As provided in FIG. 12, the downwardly sideways motion through the predictive pixels is entirely based on the assessed motion of the row of adjacent pixels above the block, as well as an upper row of pixels that are in close proximity to the block. Similarly, in mode 7, the pixels that are adjacent to the block may also be assessed as having a downward leftward angled motion. However, the angle of the downward leftward angled motion as seen in mode 7 may be steeper than the downward angled motion as seen in mode 3.

In mode 4, the pixels that are adjacent to the block may be assessed as having a rightward angled motion. As seen in FIG. 12, this rightward angled motion may be used to predict a motion to the downward right throughout the predictive pixels. Similarly, in mode 5 the pixels that are adjacent to the block may also be assessed as having a rightward angled motion, though the angled motion as illustrated in mode 5 is steeper than the angled motion in mode 4. Additionally, in mode 6 the pixels that are adjacent to the block may also be assessed as having a rightward angled motion, though the angled motion as illustrated in mode 6 is shallower than the angled motion in modes 4 or 5.

Additionally, mode 8 provides adjacent pixels to a block that indicate a motion that is upwards and to the right. However, mode 8 differs from previous modes in that mode 8 is only able to predict a portion of the block. For assessing the additional predictive pixels within the block, other auxiliary methods may be used.

While intra coding utilizes neighboring pixels of a block, such as pixels in the left column and the upper row of a current block, there may be a significant amount of residual information that is included within the central pixels of a block. In examples, the central pixels of a block may include textures, objects, and other information that may not be readily predicted using intra coding. To capture this information, information between frames (e.g. temporal compression) may be condensed and encoded.

Inter coding may be used to condense temporal correlations. For a block within a video frame, a predictor of pixel values within the block may be estimated from a correlating block within a previous frame. As consecutive video frames may be separated by only a small fraction of a second, blocks between frames may not generally differ greatly. However, the use of inter coding may be useful for predicting details within a block that would not be captured using intra-frame coding. In particular, these details are predicted by referencing blocks from nearby video frames, and blocks that are correlated between frames may be linked using a motion vector.

When implementing inter coding, an inter-frame motion estimation may initially be performed on the encoding block. The motion estimation process may determine a grid of pixels which may be considered most similar, and least costly to encode, relative to a current block. In particular, the motion estimation may determine the grid of pixels that is considered most similar by conducting a search within a search area of a video frame. Once a grid of pixels which is considered the most similar and least costly relative to the current block is determined, a motion vector may be calculated. In particular, the motion vector may be calculated as comprising the 2D pixel location difference between the current block of a first frame and its reference block of a video frame that is temporally related to the first frame. In examples, the 2D pixel location difference may use subpixel interpolation so as to define motion between frames by integer pixels, half pixels, quarter pixels, etc. An illustration of calculating a motion vector is provided in FIG. 13.
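
A minimal integer-pixel sketch of this search follows, using the sum of absolute differences (SAD) as the similarity measure; real encoders add subpixel interpolation and rate-aware costs, and the function and variable names are illustrative.

    import numpy as np

    def motion_search(current_block, reference_frame, block_xy, search_range=8):
        # Full search: test every candidate offset within +/- search_range
        # and keep the motion vector with the lowest SAD.
        bh, bw = current_block.shape
        x0, y0 = block_xy
        best_mv, best_sad = (0, 0), float("inf")
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                y, x = y0 + dy, x0 + dx
                if y < 0 or x < 0 or y + bh > reference_frame.shape[0] \
                        or x + bw > reference_frame.shape[1]:
                    continue
                sad = np.abs(reference_frame[y:y + bh, x:x + bw].astype(int)
                             - current_block.astype(int)).sum()
                if sad < best_sad:
                    best_sad, best_mv = sad, (dx, dy)
        return best_mv, best_sad

    ref = np.zeros((64, 64), dtype=np.uint8)
    ref[20:36, 24:40] = 200                        # a bright 16 x 16 patch
    cur_block = ref[20:36, 24:40].copy()
    print(motion_search(cur_block, ref, block_xy=(22, 18)))   # ((2, 2), 0)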

Accordingly, FIG. 13 provides an illustration 1300 of motion vectors linking co-located blocks across video frames, in accordance with embodiments of the invention. As seen in FIG. 13, a motion vector 1310 may link blocks 1320-1340 across video frames. Using the motion vector, a calculated motion vector may be predicted from neighboring and/or nearby video frames, even if those neighboring video frames are ahead in time, as illustrated by calculated backward motion vector (MV) 1312 and calculated forward MV 1314. This is possible due to the compression of information during inter coding. In particular, during inter coding, temporal information may be compressed, particularly by linking blocks together using motion vectors and other relational information.

Once a motion vector is determined, the motion vector may be provided to a decoder side within the encoding system. When the decoder receives this information, the decoder may find a corresponding location of a first block on a reference frame that may be linked to a block that is being processed. In this way, the motion vector may be used by the decoder to find a reference. Subsequently, the difference between the reference and the current block (i.e., the residual) may be processed and transmitted.

Header information coding may also be used to efficiently encode video data. In particular, header information that is related to a motion vector and header information that is related to a skip mode may be used to encode video data that is captured by a UAV.

Regarding motion vectors, a current block and its spatially neighboring blocks within the same video frame may have a high probability of sharing the same motion vectors. Moreover, the motion vector temporally corresponding to a current block may also serve as a predictor of the motion vector of the current block. As such, a motion vector predictor (MVP) for a current block may be calculated based on the current block's spatially and temporally neighboring blocks. The calculation of an MVP may depend on the standards of an encoding processor.
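
As one concrete convention, H.264 derives the MVP as the component-wise median of the motion vectors of the left, top, and top-right neighboring blocks; a minimal sketch of that rule follows, with illustrative vectors.

    def median_mvp(mv_left, mv_top, mv_top_right):
        # Component-wise median of the three spatial neighbors' motion
        # vectors, per the H.264-style MVP derivation.
        def median3(a, b, c):
            return sorted((a, b, c))[1]
        return (median3(mv_left[0], mv_top[0], mv_top_right[0]),
                median3(mv_left[1], mv_top[1], mv_top_right[1]))

    # Two of the three neighbors agree, so the predictor follows the majority:
    print(median_mvp((4, 1), (4, 2), (-3, 1)))   # (4, 1)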

Additionally, regarding a skip mode, additional information that is within a header of a current block may also be predicted from neighboring blocks. Further, in examples where a current block may be fully predicted from its neighboring blocks, the header of the current block may be marked as a skip block. In particular, a skip block may be used to indicate that no residual information is transmitted. In examples, a skip block may be used when the information within the current block may be calculated based on the information of blocks that neighbor the current block.

FIG. 14 illustrates a structure of prioritizing calculation of a coefficient between frames rather than searching for a motion vector, in accordance with embodiments of the invention. In particular, FIG. 14 provides an illustration 1400 of two video frames within a video that is captured by a UAV. The two video frames include objects such as trees, a coast, and a boat. In particular, a first frame 1410 is a currently encoded frame and a second, adjacent frame 1420 is a predictive frame. In terms of calculating a coefficient, the differences between the first frame 1410 and the second frame 1420 may be assessed. As provided in FIG. 14, a residual amount consists of additional portions of trees, as well as the removal of a portion of the boat, between the frames. In examples, a residual amount between two frames comprises the differences between the two frames. Additionally, a block 1415 of the currently encoded frame is associated with a particular motion vector.

In examples when motion data within a video frame is severe, bits may be preferentially allocated towards calculating a coefficient. For example, bits may be allocated towards a residual describing new trees in the second frame, as well as a residual describing the removal of the boat. In particular, the difference between an original block and its predictor may be called the residual, and this residual between blocks may be represented as a coefficient. Additionally, the motion data within a video frame may be determined to be severe when the motion data exceeds a particular threshold of an amount of motion data associated with the video frame. This may be determined based on an optical flow field that is aligned with the video frame. Additionally or alternatively, motion data that is associated with a video frame may be calculated by assessing motion data of adjacent and/or nearby video frames.
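
The residual-as-coefficient relationship described above can be sketched in a few lines; the pixel values are arbitrary, and in a real encoder the residual would then be transformed (e.g., by a DCT) and quantized.

    import numpy as np

    def block_residual(current_block, predictor_block):
        # The residual is the difference between a block and its
        # predictor; it is this residual that is transformed and
        # quantized into coefficients.
        return current_block.astype(int) - predictor_block.astype(int)

    current = np.array([[12, 14], [13, 15]], dtype=np.uint8)
    predictor = np.array([[10, 14], [13, 16]], dtype=np.uint8)
    print(block_residual(current, predictor))
    # [[ 2  0]
    #  [ 0 -1]]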

In other examples, such as when the motion data within a video frame does not exceed a threshold of motion data so as to be deemed "severe," bits may be allocated equally between calculating a coefficient associated with the video frame and identifying a motion vector within the video frame. In particular, a motion vector may be identified by providing a search area within the video frame. As motion within a video frame becomes increasingly intense, the size of a search area within the video frame may be increased. Additionally, as the intensity of motion within a video frame is increased, the shape of the search area may be modified. In particular, as the intensity of the motion within the video frame is increased, the search area may be modified from a square to a circle. The shape of the search area may also be modified based on the optical flow field. In particular, if an optical flow field indicates that there is a high degree of vertical movement, the search area within a video frame may have an increased vertical component, such as changing the shape of the search area from a square to a vertically biased rectangle. An illustration of modifying the search area associated with a block of adjacent frame 1420 is provided. In particular, the search area is modified so as to increase the chances of the motion estimation prediction evaluation identifying the motion vector that corresponds to the block within the second frame. When evaluating frame 1420 for a motion vector to link block 1425 with encoded block 1415, a search area 1430 may be assessed.
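
A minimal sketch of shaping the search area from the optical flow, per the description above, follows; the scaling factors and the intensity measure (mean flow magnitude in pixels per frame) are assumptions, since the document gives no numeric rule.

    def search_window(motion_intensity, dominant_direction, base=8, max_extent=32):
        # Grow the window with motion intensity and bias its shape toward
        # the dominant flow direction (e.g., a vertically biased rectangle
        # when the flow is mostly vertical).
        extent = min(max_extent, int(base * (1 + motion_intensity)))
        if dominant_direction == "vertical":
            return {"width": base, "height": extent}
        if dominant_direction == "horizontal":
            return {"width": extent, "height": base}
        return {"width": extent, "height": extent}   # no strong bias: square

    print(search_window(2.0, "vertical"))   # {'width': 8, 'height': 24}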

FIG. 15 is a flow chart illustrating a method 1500 of determining a quantization step for encoding video based on motion data, in accordance with embodiments of the invention. At block 1510, video captured by an image capture device is received. In particular, the video comprises a video frame component. The image capture device may be installed on a movable object, such as an unmanned aerial vehicle. Additionally, the video may be captured by the image capture device while the UAV is in flight. At block 1520, motion data associated with the video frame component is received. In examples, the motion data may include optical flow field data. Alternatively, the motion data may include sensor data. In examples, optical flow field data may be generated from sensor data. In additional examples, the motion data may indicate that a first portion of the video frame has a higher degree of movement than a second portion of the video frame. Additionally, at block 1530, a quantization step for encoding the video frame component based on the motion data is determined. In examples, determining a quantization step may comprise an encoding processor choosing a first quantization step for encoding the first portion of a video frame and choosing a second, lesser quantization step for encoding a second portion of the video frame.

FIG. 16 is a flow chart illustrating another method 1600 of determining a quantization step for encoding video based on motion data, in accordance with embodiments of the invention. At block 1610, video captured by an image capture device is received, the video comprising a first video frame component and a second video frame component. The video comprises a video frame. At block 1620, motion data associated with the second video frame component is received. In examples, the motion data may be obtained using one or more sensors. In further examples, the sensors may comprise one or more of an optical sensor, ultrasonic sensor, MVO, gyroscope, GPS, and altimeter. Additionally, at block 1630, a quantization step for encoding the first video frame component is determined based on the motion data associated with the second video frame component. In particular, determining a quantization step may comprise choosing a quantization step for encoding the first video frame that minimizes loss for encoding a coefficient between a first video frame and a second video frame. In additional examples, the quantization step may be determined block-by-block within the video frame.

FIG. 17 is a flow chart illustrating a method 1700 of bit allocation for encoding video based on motion data, in accordance with embodiments of the invention. At block 1710, video captured by an image capture device is received. The image capture device may be installed on a movable object. In particular, the image capture device may be installed on a UAV. Additionally, the video comprises a video frame component.

At block 1720, motion data associated with the video frame component is received. The motion data may include optical flow field data. Additionally, the motion data may indicate that the block has movement that exceeds a predetermined threshold. At block 1730, bits associated with encoding the video frame component are allocated based on the motion data. In examples, an amount of bits for encoding a block may be allocated so as to be commensurate with a block having movement that exceeds a predetermined threshold. In other examples, allocating bits may comprise choosing an amount of allocated bits, wherein a higher amount of allocated bits is chosen when the motion data indicates a higher degree of movement, relative to a lower amount of allocated bits that is chosen when the motion data indicates a lower degree of movement.

FIG. 18 is a flow chart illustrating another method 1800 of bit allocation for encoding video based on motion data, in accordance with embodiments of the invention. At block 1810, video captured by an image capture device is received. The video comprises a first and a second video frame component. In examples, the image capture device is a camera. In additional examples, the first video frame component may be a first video frame and the second video frame component may be a second video frame. Additionally, the first video frame may be adjacent in time to the second video frame. At block 1820, motion data associated with the second video frame component is received. In examples, the motion data may be obtained using one or more sensors. Additionally, at block 1830, bits associated with encoding the first video frame component are allocated based on the motion data associated with the second video frame component. In examples, bits may be allocated for encoding the first video frame so as to minimize loss for encoding a coefficient between the first video frame and the second video frame. In other examples, bits may be allocated for encoding a first block so as to minimize loss for encoding the coefficient between the first block and a second block.

The systems, devices, and methods described herein for video encoding may apply to any video that is captured by a video capture device supported by a variety of objects. In particular, the video may be captured by a video capture device that is supported by an aerial vehicle. As previously mentioned, any description herein of an aerial vehicle, such as a UAV, may apply to and be used for any movable object. Any description herein of an aerial vehicle may apply specifically to UAVs. A movable object of the present invention may be configured to move within any suitable environment, such as in air (e.g., a fixed-wing aircraft, a rotary-wing aircraft, or an aircraft having neither fixed wings nor rotary wings), in water (e.g., a ship or a submarine), on ground (e.g., a motor vehicle, such as a car, truck, bus, van, motorcycle, or bicycle; a movable structure or frame such as a stick or fishing pole; or a train), under the ground (e.g., a subway), in space (e.g., a spaceplane, a satellite, or a probe), or any combination of these environments. The movable object may be a vehicle, such as a vehicle described elsewhere herein. In some embodiments, the movable object may be carried by a living subject, or take off from a living subject, such as a human or an animal. Suitable animals may include avians, canines, felines, equines, bovines, ovines, porcines, delphines, rodents, or insects.

The movable object may be capable of moving freely within the environment with respect to six degrees of freedom (e.g., three degrees of freedom in translation and three degrees of freedom in rotation). Alternatively, the movement of the movable object may be constrained with respect to one or more degrees of freedom, such as by a predetermined path, track, or orientation. The movement may be actuated by any suitable actuation mechanism, such as an engine or a motor. The actuation mechanism of the movable object may be powered by any suitable energy source, such as electrical energy, magnetic energy, solar energy, wind energy, gravitational energy, chemical energy, nuclear energy, or any suitable combination thereof. The movable object may be self-propelled via a propulsion system, as described elsewhere herein. The propulsion system may optionally run on an energy source, such as electrical energy, magnetic energy, solar energy, wind energy, gravitational energy, chemical energy, nuclear energy, or any suitable combination thereof. Alternatively, the movable object may be carried by a living being.

In some instances, the movable object may be an aerial vehicle. For example, aerial vehicles may be fixed-wing aircraft (e.g., airplanes, gliders), rotary-wing aircraft (e.g., helicopters, rotorcraft), aircraft having both fixed wings and rotary wings, or aircraft having neither (e.g., blimps, hot air balloons). An aerial vehicle may be self-propelled, such as self-propelled through the air. A self-propelled aerial vehicle may utilize a propulsion system, such as a propulsion system including one or more engines, motors, wheels, axles, magnets, rotors, propellers, blades, nozzles, or any suitable combination thereof. In some instances, the propulsion system may be used to enable the movable object to take off from a surface, land on a surface, maintain its current position and/or orientation (e.g., hover), change orientation, and/or change position.

The movable object may be controlled remotely by a user or controlled locally by an occupant within or on the movable object. The movable object may be controlled remotely via an occupant within a separate vehicle. In some embodiments, the movable object is an unmanned movable object, such as a UAV. An unmanned movable object, such as a UAV, may not have an occupant on-board the movable object. The movable object may be controlled by a human or an autonomous control system (e.g., a computer control system), or any suitable combination thereof. The movable object may be an autonomous or semi-autonomous robot, such as a robot configured with an artificial intelligence.

The movable object may have any suitable size and/or dimensions. In some embodiments, the movable object may be of a size and/or dimensions to have a human occupant within or on the vehicle. Alternatively, the movable object may be of a size and/or dimensions smaller than that capable of having a human occupant within or on the vehicle. The movable object may be of a size and/or dimensions suitable for being lifted or carried by a human. Alternatively, the movable object may be larger than a size and/or dimensions suitable for being lifted or carried by a human. In some instances, the movable object may have a maximum dimension (e.g., length, width, height, diameter, diagonal) of less than or equal to about: 2 cm, 5 cm, 10 cm, 50 cm, 1 m, 2 m, 5 m, or 10 m. The maximum dimension may be greater than or equal to about: 2 cm, 5 cm, 10 cm, 50 cm, 1 m, 2 m, 5 m, or 10 m. For example, the distance between shafts of opposite rotors of the movable object may be less than or equal to about: 2 cm, 5 cm, 10 cm, 50 cm, 1 m, 2 m, 5 m, or 10 m. Alternatively, the distance between shafts of opposite rotors may be greater than or equal to about: 2 cm, 5 cm, 10 cm, 50 cm, 1 m, 2 m, 5 m, or 10 m.

In some embodiments, the movable object may have a volume of less than 100 cm×100 cm×100 cm, less than 50 cm×50 cm×30 cm, or less than 5 cm×5 cm×3 cm. The total volume of the movable object may be less than or equal to about: 1 cm³, 2 cm³, 5 cm³, 10 cm³, 20 cm³, 30 cm³, 40 cm³, 50 cm³, 60 cm³, 70 cm³, 80 cm³, 90 cm³, 100 cm³, 150 cm³, 200 cm³, 300 cm³, 500 cm³, 750 cm³, 1000 cm³, 5000 cm³, 10,000 cm³, 100,000 cm³, 1 m³, or 10 m³. Conversely, the total volume of the movable object may be greater than or equal to about: 1 cm³, 2 cm³, 5 cm³, 10 cm³, 20 cm³, 30 cm³, 40 cm³, 50 cm³, 60 cm³, 70 cm³, 80 cm³, 90 cm³, 100 cm³, 150 cm³, 200 cm³, 300 cm³, 500 cm³, 750 cm³, 1000 cm³, 5000 cm³, 10,000 cm³, 100,000 cm³, 1 m³, or 10 m³.

In some embodiments, the movable object may have a footprint (which may refer to the lateral cross-sectional area encompassed by the movable object) less than or equal to about: 32,000 cm², 20,000 cm², 10,000 cm², 1,000 cm², 500 cm², 100 cm², 50 cm², 10 cm², or 5 cm². Conversely, the footprint may be greater than or equal to about: 32,000 cm², 20,000 cm², 10,000 cm², 1,000 cm², 500 cm², 100 cm², 50 cm², 10 cm², or 5 cm².

In some instances, the movable object may weigh no more than 1000 kg. The weight of the movable object may be less than or equal to about: 1000 kg, 750 kg, 500 kg, 200 kg, 150 kg, 100 kg, 80 kg, 70 kg, 60 kg, 50 kg, 45 kg, 40 kg, 35 kg, 30 kg, 25 kg, 20 kg, 15 kg, 12 kg, 10 kg, 9 kg, 8 kg, 7 kg, 6 kg, 5 kg, 4 kg, 3 kg, 2 kg, 1 kg, 0.5 kg, 0.1 kg, 0.05 kg, or 0.01 kg. Conversely, the weight may be greater than or equal to about: 1000 kg, 750 kg, 500 kg, 200 kg, 150 kg, 100 kg, 80 kg, 70 kg, 60 kg, 50 kg, 45 kg, 40 kg, 35 kg, 30 kg, 25 kg, 20 kg, 15 kg, 12 kg, 10 kg, 9 kg, 8 kg, 7 kg, 6 kg, 5 kg, 4 kg, 3 kg, 2 kg, 1 kg, 0.5 kg, 0.1 kg, 0.05 kg, or 0.01 kg.

In some embodiments, a movable object may be small relative to a load carried by the movable object. The load may include a payload and/or a carrier, as described in further detail elsewhere herein. In some examples, a ratio of a movable object weight to a load weight may be greater than, less than, or equal to about 1:1. Optionally, a ratio of a carrier weight to a load weight may be greater than, less than, or equal to about 1:1. When desired, the ratio of a movable object weight to a load weight may be less than or equal to: 1:2, 1:3, 1:4, 1:5, 1:10, or even less. Conversely, the ratio of a movable object weight to a load weight may also be greater than or equal to: 2:1, 3:1, 4:1, 5:1, 10:1, or even greater.

In some embodiments, the movable object may have low energy consumption. For example, the movable object may use less than about: 5 W/h, 4 W/h, 3 W/h, 2 W/h, 1 W/h, or less. In some instances, a carrier of the movable object may have low energy consumption. For example, the carrier may use less than about: 5 W/h, 4 W/h, 3 W/h, 2 W/h, 1 W/h, or less. Optionally, a payload of the movable object may have low energy consumption, such as less than about: 5 W/h, 4 W/h, 3 W/h, 2 W/h, 1 W/h, or less.

FIG. 19 illustrates an unmanned aerial vehicle (UAV) 1900, in accordance with embodiments of the present invention. The UAV may be an example of a movable object as described herein. The UAV 1900 may include a propulsion system having four rotors 1902, 1904, 1906, and 1908. Any number of rotors may be provided (e.g., one, two, three, four, five, six, or more). The rotors, rotor assemblies, or other propulsion systems of the unmanned aerial vehicle may enable the unmanned aerial vehicle to hover/maintain position, change orientation, and/or change location. The distance between shafts of opposite rotors may be any suitable length 1910. For example, the length 1910 may be less than or equal to 2 m, or less than or equal to 5 m. In some embodiments, the length 1910 may be within a range from 40 cm to 1 m, from 10 cm to 2 m, or from 5 cm to 5 m. Any description herein of a UAV may apply to a movable object, such as a movable object of a different type, and vice versa. The UAV may use an assisted takeoff system or method as described herein.

In some embodiments, the movable object may be configured to carry a load. The load may include one or more of passengers, cargo, equipment, instruments, and the like. The load may be provided within a housing. The housing may be separate from a housing of the movable object, or be part of a housing for a movable object. Alternatively, the load may be provided with a housing while the movable object does not have a housing. Alternatively, portions of the load or the entire load may be provided without a housing. The load may be rigidly fixed relative to the movable object. Optionally, the load may be movable relative to the movable object (e.g., translatable or rotatable relative to the movable object). The load may include a payload and/or a carrier, as described elsewhere herein.

In some embodiments, the movement of the movable object, carrier, and payload relative to a fixed reference frame (e.g., the surrounding environment) and/or to each other, may be controlled by a terminal. The terminal may be a remote control device at a location distant from the movable object, carrier, and/or payload. The terminal may be disposed on or affixed to a support platform. Alternatively, the terminal may be a handheld or wearable device. For example, the terminal may include a smartphone, tablet, laptop, computer, glasses, gloves, helmet, microphone, or suitable combinations thereof. The terminal may include a user interface, such as a keyboard, mouse, joystick, touchscreen, or display. Any suitable user input may be used to interact with the terminal, such as manually entered commands, voice control, gesture control, or position control (e.g., via a movement, location, or tilt of the terminal).

The terminal may be used to control any suitable state of the movable object, carrier, and/or payload. For example, the terminal may be used to control the position and/or orientation of the movable object, carrier, and/or payload relative to a fixed reference frame and/or to each other. In some embodiments, the terminal may be used to control individual elements of the movable object, carrier, and/or payload, such as the actuation assembly of the carrier, a sensor of the payload, or an emitter of the payload. The terminal may include a wireless communication device adapted to communicate with one or more of the movable object, carrier, or payload.

The terminal may include a suitable display unit for viewing information of the movable object, carrier, and/or payload. For example, the terminal may be configured to display information of the movable object, carrier, and/or payload with respect to position, translational velocity, translational acceleration, orientation, angular velocity, angular acceleration, or any suitable combinations thereof. In some embodiments, the terminal may display information provided by the payload, such as data provided by a functional payload (e.g., images recorded by a camera or other image capturing device).

Optionally, the same terminal may both control the movable object, carrier, and/or payload, or a state of the movable object, carrier, and/or payload, as well as receive and/or display information from the movable object, carrier, and/or payload. For example, a terminal may control the positioning of the payload relative to an environment, while displaying image data captured by the payload, or information about the position of the payload. Alternatively, different terminals may be used for different functions. For example, a first terminal may control movement or a state of the movable object, carrier, and/or payload while a second terminal may receive and/or display information from the movable object, carrier, and/or payload. For example, a first terminal may be used to control the positioning of the payload relative to an environment while a second terminal displays image data captured by the payload. Various communication modes may be utilized between a movable object and an integrated terminal that both controls the movable object and receives data, or between the movable object and multiple terminals that both control the movable object and receive data. For example, at least two different communication modes may be formed between the movable object and the terminal that both controls the movable object and receives data from the movable object.

FIG. 20 illustrates a movable object 2000 including a carrier 2002 and a payload 2004, in accordance with embodiments. Although the movable object 2000 is depicted as an aircraft, this depiction is not intended to be limiting, and any suitable type of movable object may be used, as previously described herein. One of skill in the art would appreciate that any of the embodiments described herein in the context of aircraft systems may be applied to any suitable movable object (e.g., a UAV). In some instances, the payload 2004 may be provided on the movable object 2000 without requiring the carrier 2002. The movable object 2000 may include propulsion mechanisms 2006, a sensing system 2008, and a communication system 2010.

The propulsion mechanisms 2006 may include one or more of rotors, propellers, blades, engines, motors, wheels, axles, magnets, or nozzles, as previously described. The movable object may have one or more, two or more, three or more, or four or more propulsion mechanisms. The propulsion mechanisms may all be of the same type. Alternatively, one or more propulsion mechanisms may be different types of propulsion mechanisms. The propulsion mechanisms 2006 may be mounted on the movable object 2000 using any suitable means, such as a support element (e.g., a drive shaft) as described elsewhere herein. The propulsion mechanisms 2006 may be mounted on any suitable portion of the movable object 2000, such as on the top, bottom, front, back, sides, or suitable combinations thereof.

In some embodiments, the propulsion mechanisms 2006 may enable the movable object 2000 to take off vertically from a surface or land vertically on a surface without requiring any horizontal movement of the movable object 2000 (e.g., without traveling down a runway). Optionally, the propulsion mechanisms 2006 may be operable to permit the movable object 2000 to hover in the air at a specified position and/or orientation. One or more of the propulsion mechanisms 2006 may be controlled independently of the other propulsion mechanisms. Alternatively, the propulsion mechanisms 2006 may be configured to be controlled simultaneously. For example, the movable object 2000 may have multiple horizontally oriented rotors that may provide lift and/or thrust to the movable object. The multiple horizontally oriented rotors may be actuated to provide vertical takeoff, vertical landing, and hovering capabilities to the movable object 2000. In some embodiments, one or more of the horizontally oriented rotors may spin in a clockwise direction, while one or more of the horizontally oriented rotors may spin in a counterclockwise direction. For example, the number of clockwise rotors may be equal to the number of counterclockwise rotors. The rotation rate of each of the horizontally oriented rotors may be varied independently in order to control the lift and/or thrust produced by each rotor, and thereby adjust the spatial disposition, velocity, and/or acceleration of the movable object 2000 (e.g., with respect to up to three degrees of translation and up to three degrees of rotation).
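To illustrate how varying each rotor's rate independently adjusts the disposition of such a vehicle, the minimal mixer below computes per-rotor rates for a four-rotor "+" layout. The sign conventions, rotor ordering, and spin directions are assumptions made for the example; the specification does not define a control law.

    def mix_rotor_rates(base_rate, roll_cmd, pitch_cmd, yaw_cmd):
        """Return four rotor rates: front/rear differences drive pitch,
        left/right differences drive roll, and the imbalance between the
        clockwise pair and the counterclockwise pair drives yaw, while
        the shared base rate sets total lift."""
        return [
            base_rate + pitch_cmd - yaw_cmd,  # front rotor (clockwise)
            base_rate - roll_cmd + yaw_cmd,   # right rotor (counterclockwise)
            base_rate - pitch_cmd - yaw_cmd,  # rear rotor (clockwise)
            base_rate + roll_cmd + yaw_cmd,   # left rotor (counterclockwise)
        ]

With equal numbers of clockwise and counterclockwise rotors, the reaction torques cancel at hover (yaw_cmd = 0), consistent with the equal-count arrangement described above.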

The sensing system 2008 may include one or more sensors that may sense the spatial disposition, velocity, and/or acceleration of the movable object 2000 (e.g., with respect to up to three degrees of translation and up to three degrees of rotation). The one or more sensors may include global positioning system (GPS) sensors, motion sensors, inertial sensors, proximity sensors, or image sensors. The sensing data provided by the sensing system 2008 may be used to control the spatial disposition, velocity, and/or orientation of the movable object 2000 (e.g., using a suitable processing unit and/or control module, as described below). Alternatively, the sensing system 2008 may be used to provide data regarding the environment surrounding the movable object, such as weather conditions, proximity to potential obstacles, location of geographical features, location of manmade structures, and the like.

The communication system 2010 enables communication with terminal 2012 having a communication system 2014 via wireless signals 2016. The communication systems 2010, 2014 may include any number of transmitters, receivers, and/or transceivers suitable for wireless communication. The communication may be one-way communication, such that data may be transmitted in only one direction. For example, one-way communication may involve only the movable object 2000 transmitting data to the terminal 2012, or vice-versa. The data may be transmitted from one or more transmitters of the communication system 2010 to one or more receivers of the communication system 2014, or vice-versa. Alternatively, the communication may be two-way communication, such that data may be transmitted in both directions between the movable object 2000 and the terminal 2012. The two-way communication may involve transmitting data from one or more transmitters of the communication system 2010 to one or more receivers of the communication system 2014, and vice-versa.

In some embodiments, the terminal 2012 may provide control data to one or more of the movable object 2000, carrier 2002, and payload 2004 and receive information from one or more of the movable object 2000, carrier 2002, and payload 2004 (e.g., position and/or motion information of the movable object, carrier, or payload; data sensed by the payload such as image data captured by a payload camera). In some instances, control data from the terminal may include instructions for relative positions, movements, actuations, or controls of the movable object, carrier, and/or payload. For example, the control data may result in a modification of the location and/or orientation of the movable object (e.g., via control of the propulsion mechanisms 2006), or a movement of the payload with respect to the movable object (e.g., via control of the carrier 2002). The control data from the terminal may result in control of the payload, such as control of the operation of a camera or other image capturing device (e.g., taking still or moving pictures, zooming in or out, turning on or off, switching imaging modes, changing image resolution, changing focus, changing depth of field, changing exposure time, changing viewing angle or field of view). In some instances, the communications from the movable object, carrier, and/or payload may include information from one or more sensors (e.g., of the sensing system 2008 or of the payload 2004). The communications may include sensed information from one or more different types of sensors (e.g., GPS sensors, motion sensors, inertial sensors, proximity sensors, or image sensors). Such information may pertain to the position (e.g., location, orientation), movement, or acceleration of the movable object, carrier, and/or payload. Such information from a payload may include data captured by the payload or a sensed state of the payload. The control data transmitted by the terminal 2012 may be configured to control a state of one or more of the movable object 2000, carrier 2002, or payload 2004. Alternatively or in combination, the carrier 2002 and payload 2004 may also each include a communication module configured to communicate with terminal 2012, such that the terminal may communicate with and control each of the movable object 2000, carrier 2002, and payload 2004 independently.

In some embodiments, the movable object 2000 may be configured to communicate with another remote device in addition to the terminal 2012, or instead of the terminal 2012. The terminal 2012 may also be configured to communicate with another remote device as well as the movable object 2000. For example, the movable object 2000 and/or terminal 2012 may communicate with another movable object, or a carrier or payload of another movable object. When desired, the remote device may be a second terminal or other computing device (e.g., computer, laptop, tablet, smartphone, or other mobile device). The remote device may be configured to transmit data to the movable object 2000, receive data from the movable object 2000, transmit data to the terminal 2012, and/or receive data from the terminal 2012. Optionally, the remote device may be connected to the Internet or other telecommunications network, such that data received from the movable object 2000 and/or terminal 2012 may be uploaded to a website or server.

FIG. 21 is a schematic illustration by way of block diagram of a system 2100 for controlling a movable object, in accordance with embodiments. The system 2100 may be used in combination with any suitable embodiment of the systems, devices, and methods disclosed herein. The system 2100 may include a sensing module 2102, processing unit 2104, non-transitory computer readable medium 2106, control module 2108, and communication module 2110.

The sensing module 2102 may utilize different types of sensors that collect information relating to the movable objects in different ways. Different types of sensors may sense different types of signals or signals from different sources. For example, the sensors may include inertial sensors, GPS sensors, proximity sensors (e.g., lidar), or vision/image sensors (e.g., a camera). The sensing module 2102 may be operatively coupled to a processing unit 2104 having a plurality of processors. In some embodiments, the sensing module may be operatively coupled to a transmission module 2112 (e.g., a Wi-Fi image transmission module) configured to directly transmit sensing data to a suitable external device or system. For example, the transmission module 2112 may be used to transmit images captured by a camera of the sensing module 2102 to a remote terminal.

The processing unit 2104 may have one or more processors, such as a programmable processor (e.g., a central processing unit (CPU)). The processing unit 2104 may be operatively coupled to a non-transitory computer readable medium 2106. The non-transitory computer readable medium 2106 may store logic, code, and/or program instructions executable by the processing unit 2104 for performing one or more steps. The non-transitory computer readable medium may include one or more memory units (e.g., removable media or external storage such as an SD card or random access memory (RAM)). In some embodiments, data from the sensing module 2102 may be directly conveyed to and stored within the memory units of the non-transitory computer readable medium 2106. The memory units of the non-transitory computer readable medium 2106 may store logic, code, and/or program instructions executable by the processing unit 2104 to perform any suitable embodiment of the methods described herein. For example, the processing unit 2104 may be configured to execute instructions causing one or more processors of the processing unit 2104 to analyze sensing data produced by the sensing module. The memory units may store sensing data from the sensing module to be processed by the processing unit 2104. In some embodiments, the memory units of the non-transitory computer readable medium 2106 may be used to store the processing results produced by the processing unit 2104.

In some embodiments, the processing unit 2104 may be operatively coupled to a control module 2108 configured to control a state of the movable object. For example, the control module 2108 may be configured to control the propulsion mechanisms of the movable object to adjust the spatial disposition, velocity, and/or acceleration of the movable object with respect to six degrees of freedom. Alternatively or in combination, the control module 2108 may control one or more of a state of a carrier, payload, or sensing module.

The processing unit 2104 may be operatively coupled to a communication module 2110 configured to transmit and/or receive data from one or more external devices (e.g., a terminal, display device, or other remote controller). Any suitable means of communication may be used, such as wired communication or wireless communication. For example, the communication module 2110 may utilize one or more of local area networks (LAN), wide area networks (WAN), infrared, radio, Wi-Fi, point-to-point (P2P) networks, telecommunication networks, cloud communication, and the like. Optionally, relay stations, such as towers, satellites, or mobile stations, may be used. Wireless communications may be proximity dependent or proximity independent. In some embodiments, line-of-sight may or may not be required for communications. The communication module 2110 may transmit and/or receive one or more of sensing data from the sensing module 2102, processing results produced by the processing unit 2104, predetermined control data, user commands from a terminal or remote controller, and the like.

The components of the system 2100 may be arranged in any suitable configuration. For example, one or more of the components of the system 2100 may be located on the movable object, carrier, payload, terminal, sensing system, or an additional external device in communication with one or more of the above. Additionally, although FIG. 21 depicts a single processing unit 2104 and a single non-transitory computer readable medium 2106, one of skill in the art would appreciate that this is not intended to be limiting, and that the system 2100 may include a plurality of processing units and/or non-transitory computer readable media. In some embodiments, one or more of the plurality of processing units and/or non-transitory computer readable media may be situated at different locations, such as on the movable object, carrier, payload, terminal, sensing module, additional external device in communication with one or more of the above, or suitable combinations thereof, such that any suitable aspect of the processing and/or memory functions performed by the system 2100 may occur at one or more of the aforementioned locations.

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

1-20. (canceled)
 21. A method of determining a quantization step for encoding video based on motion data, said method comprising: receiving video captured by an image capture device, the video comprising a video frame component, the video frame component including a video frame; receiving motion data associated with the video frame component, the motion data including a degree of movement; and determining a quantization step for encoding the video frame component based on the motion data, including: in response to the degree of movement exceeding a threshold degree of movement, choosing a first quantization step for encoding the video frame; and in response to the degree of movement not exceeding the threshold degree of movement, choosing a second quantization step for encoding the video frame, the second quantization step being less than the first quantization step.
 22. The method of claim 21, wherein the image capture device is installed on an unmanned aerial vehicle (UAV), and wherein the step of capturing the video occurs while the UAV is in flight.
 23. The method of claim 22, wherein the motion data is obtained using one or more sensors on-board the UAV.
 24. The method of claim 23, wherein the image capture device is carried by a gimbal configured on the UAV, and the motion data obtained by the one or more sensors includes rotation motion data indicating rotation about one or more of: a yaw axis, a pitch axis, or a roll axis of the video capture device.
 25. The method of claim 24, wherein the one or more sensors are configured on the gimbal.
 26. The method of claim 24, wherein the one or more sensors are configured off-board the gimbal.
 27. The method of claim 21, wherein the motion data includes optical flow field data that demonstrates how light flows within the video frame component.
 28. The method of claim 27, wherein the optical flow field data is generated with aid of an optical flow field generator based on data obtained using one or more sensors on-board the UAV.
 29. The method of claim 28, wherein the image capture device is on-board a UAV, and wherein (1) the optical flow field data is generated, or (2) the quantization step for encoding the video frame component is determined, while the UAV is in flight.
 30. The method of claim 21, further comprising encoding the video frame component based on the determined quantization step for encoding the video frame component.
 31. A non-transitory computer readable medium containing program instructions for determining a quantization step for encoding video based on motion data, said computer readable medium comprising: program instructions for receiving video captured by an image capture device, the video comprising a video frame component, the video frame component including a video frame; program instructions for receiving motion data associated with the video frame component, the motion data including a degree of movement; and program instructions for determining a quantization step for encoding the video frame component based on the motion data, determining the quantization step including: in response to the degree of movement exceeding a threshold degree of movement, choosing a first quantization step for encoding the video frame; and in response to the degree of movement not exceeding the threshold degree of movement, choosing a second quantization step for encoding the video frame, the second quantization step being less than the first quantization step.
 32. The computer readable medium of claim 31, wherein the image capture device is installed on an unmanned aerial vehicle (UAV), and wherein the program instructions for receiving the video are executed while the UAV is in flight.
 33. The computer readable medium of claim 32, wherein the program instructions for determining the quantization step for encoding the video frame component based on the motion data are executed while the UAV is in flight.
 34. The computer readable medium of claim 32, wherein the motion data is obtained using one or more sensors on-board the UAV.
 35. The computer readable medium of claim 34, wherein the image capture device is carried by a gimbal configured on the UAV, and the motion data obtained by the one or more sensors includes rotation motion data indicating rotation about one or more of: a yaw axis, a pitch axis, or a roll axis of the video capture device.
 36. The computer readable medium of claim 31, wherein the motion data includes optical flow field data that demonstrates how light flows within the video frame component.
 37. The computer readable medium of claim 31, further comprising program instructions for encoding the video frame component based on the determined quantization step for encoding the video frame component.
 38. A system for determining a quantization step for encoding video based on motion data, said system comprising: an image capture device configured to capture a video; and one or more processors, individually or collectively configured to: receive the video captured by the image capture device, the video comprising a video frame component, the video frame component including a video frame; receive motion data associated with the video frame component, the motion data including a degree of movement; and determine a quantization step for encoding the video frame component based on the motion data, including: in response to the degree of movement exceeding a threshold degree of movement, choosing a first quantization step for encoding the video frame; and in response to the degree of movement not exceeding the threshold degree of movement, choosing a second quantization step for encoding the video frame, the second quantization step being less than the first quantization step.
 39. The system of claim 38, wherein the image capture device is installed on an unmanned aerial vehicle (UAV), and wherein the motion data is obtained using one or more sensors, wherein the one or more sensors include one or more of the following: optical sensor, ultrasonic sensor, MVO, gyroscope, GPS, altimeter.
 40. The system of claim 39, wherein the one or more processors are on-board the UAV.
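For readers tracing the claimed threshold logic, the following sketch restates the quantization-step choice recited in claims 21, 31, and 38 in Python. It is illustrative only: the numeric threshold and step values are hypothetical placeholders, not values recited in the claims.

    # Hypothetical degree-of-movement threshold and quantization steps.
    THRESHOLD_DEGREE_OF_MOVEMENT = 4.0
    FIRST_QUANTIZATION_STEP = 40   # coarser step for high-motion frames
    SECOND_QUANTIZATION_STEP = 20  # finer step, less than the first step

    def choose_quantization_step(degree_of_movement):
        """Choose the first (larger) quantization step when the degree of
        movement exceeds the threshold, and the second (smaller) step
        otherwise, mirroring the two branches of claim 21."""
        if degree_of_movement > THRESHOLD_DEGREE_OF_MOVEMENT:
            return FIRST_QUANTIZATION_STEP
        return SECOND_QUANTIZATION_STEP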